Artificial chromosomes, uses thereof and methods for preparing artificial chromosomes

ABSTRACT

Methods for preparing cell lines that contain artificial chromosomes, methods for preparation of artificial chromosomes, methods for purification of artificial chromosomes, methods for targeted insertion of heterologous DNA into artificial chromosomes, methods for amplification of nucleic acids and methods for delivery of the chromosomes to selected cells and tissues are provided. Also provided are cell lines for use in the methods, and cell lines and chromosomes produced by the methods. Methods for use of the artificial chromosomes are also provided.

RELATED APPLICATIONS

[0001] This application is a divisional of copending U.S. application Ser. No. 09/799,462, filed Mar. 5, 2001, to GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES.

[0002] This application is a continuation of copending U.S. application Ser. Nos. 08/835,682 and 09/724,693, filed Apr. 10, 1997 and Nov. 28, 2000, respectively, to GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES. This application is also a continuation-in-part of U.S. application Ser. No. 08/695,191, filed Aug. 7, 1996, now U.S. Pat. No. 6,025,155, to GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES. This application is also a continuation-in-part of U.S. application Ser. No. 08/682,080, filed Jul. 15, 1996, now U.S. Pat. No. 6,077,697, to GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES, and is also a continuation-in-part of copending U.S. application Ser. No. 08/629,822, filed Apr. 10, 1996 to GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES.

[0003] U.S. application Ser. No. 09/724,693 is a continuation of U.S. application Ser. No. 08/835,682, which is a continuation-in-part of U.S. application Ser. No. 08/695,191; U.S. application Ser. No. 08/695,191 is a continuation-in-part of U.S. application Ser. No. 08/682,080 and also is a continuation-in-part of U.S. application Ser. No. 08/629,822. U.S. application Ser. No. 08/682,080 is a continuation-in-part of U.S. application Ser. No. 08/629,822.

[0004] This application is related to U.S. application Ser. No. 07/759,558, now U.S. Pat. No. 5,288,625, is related to U.S. application Ser. No. 08/734,344, filed Oct. 21, 1996, and is related to U.S. application Ser. No. 08/375,271, filed Jan. 19, 1995, now U.S. Pat. No. 5,712,134. U.S. application Ser. No. 08/375,271 is a continuation of U.S. application Ser. No. 08/080,097, filed Jun. 23, 1993, which is a continuation of U.S. application Ser. No. 07/892,487, filed Jun. 3, 1992, which is a continuation of U.S. application Ser. No. 07/521,073, filed May 9, 1990.

[0005] The subject matter of each of the above-noted U.S. applications and patents is incorporated in its entirety by reference thereto.

FIELD OF THE INVENTION

[0006] The present invention relates to methods for preparing cell lines that contain artificial chromosomes, methods for isolation of the artificial chromosomes, targeted insertion of heterologous DNA into the chromosomes, delivery of the chromosomes to selected cells and tissues and methods for isolation and large-scale production of the chromosomes. Also provided are cell lines for use in the methods, and cell lines and chromosomes produced by the methods. Further provided are cell-based methods for production of heterologous proteins, gene therapy methods and methods of generating transgenic animals, particularly non-human transgenic animals, that use artificial chromosomes.

BACKGROUND OF THE INVENTION

[0007] Several viral vector, non-viral, and physical delivery systems for gene therapy and recombinant expression of heterologous nucleic acids have been developed [see, e.g., Mitani et al. (1993) Trends Biotech. 11:162-166]. The presently available systems, however, have numerous limitations, particularly where persistent, stable, or controlled gene expression is required. These limitations include: (1) size limitations, because there is a limit, generally on the order of about ten kilobases [kb] at most, to the size of the DNA insert [gene] that can be accepted by viral vectors, whereas a number of mammalian genes of possible therapeutic importance are well above this limit, especially if all control elements are included; (2) the inability to specifically target integration, so that random integration occurs, which carries a risk of disrupting vital genes or cancer suppressor genes; (3) the expression of randomly integrated therapeutic genes may be affected by the functional compartmentalization in the nucleus and is affected by chromatin-based position effects; (4) the copy number, and consequently the expression, of a given gene to be integrated into the genome cannot be controlled. Thus, improvements in gene delivery and stable expression systems are needed [see, e.g., Mulligan (1993) Science 260:926-932].

[0008] In addition, safe and effective vectors and gene therapy methods should have numerous features that are not assured by the presently available systems. For example, a safe vector should not contain DNA elements that can promote unwanted changes by recombination or mutation in the host genetic material, should not have the potential to initiate deleterious effects in cells, tissues, or organisms carrying the vector, and should not interfere with genomic functions. In addition, it would be advantageous for the vector to be non-integrative, or designed for site-specific integration. Also, the copy number of therapeutic gene(s) carried by the vector should be controlled and stable; the vector should secure the independent and controlled function of the introduced gene(s); and the vector should accept large (up to Mb size) inserts and ensure the functional stability of the insert.

[0009] The limitations of existing gene delivery technologies, however, argue for the development of alternative vector systems suitable for transferring large [up to Mb size or larger] genes and gene complexes together with regulatory elements that will provide a safe, controlled, and persistent expression of the therapeutic genetic material.

[0010] At the present time, none of the available vectors fulfill all these requirements. Most of these characteristics, however, are possessed by chromosomes. Thus, an artificial chromosome would be an ideal vector for gene therapy, as well as for stable, high-level, controlled production of gene products that require coordination of expression of numerous genes or that are encoded by large genes, and other uses. Artificial chromosomes for expression of heterologous genes in yeast are available, but construction of defined mammalian artificial chromosomes has not been achieved. Such construction has been hindered by the lack of an isolated, functional, mammalian centromere and uncertainty regarding the requisites for its production and stable replication. Unlike in yeast, there are no selectable genes in close proximity to a mammalian centromere, and the presence of long runs of highly repetitive pericentric heterochromatic DNA makes the isolation of a mammalian centromere using presently available methods, such as chromosome walking, virtually impossible. Other strategies are required for production of mammalian artificial chromosomes, and some have been developed. For example, U.S. Pat. No. 5,288,625 provides a cell line that contains an artificial chromosome, a minichromosome, that is about 20 to 30 megabases. Methods provided for isolation of these chromosomes, however, provide preparations of only about 10-20% purity. Thus, development of alternative artificial chromosomes and perfection of isolation and purification methods as well as development of more versatile chromosomes and further characterization of the minichromosomes is required to realize the potential of this technology.

[0011] Therefore, it is an object herein to provide mammalian artificial chromosomes and methods for introduction of foreign DNA into such chromosomes. It is also an object herein to provide methods of isolation and purification of the chromosomes. It is also an object herein to provide methods for introduction of the mammalian artificial chromosome into selected cells, and to provide the resulting cells, as well as transgenic non-human animals, birds, fish and plants that contain the artificial chromosomes. It is also an object herein to provide methods for gene therapy and expression of gene products using artificial chromosomes. It is a further object herein to provide methods for constructing species-specific artificial chromosomes de novo. Another object herein is to provide methods to generate de novo mammalian artificial chromosomes.

SUMMARY OF THE INVENTION

[0012] Mammalian artificial chromosomes [MACs] are provided. Also provided are artificial chromosomes for other higher eukaryotic species, such as insects, birds, fowl and fish, produced using the MACs and methods provided herein. Methods for generating and isolating such chromosomes are provided. Methods using the MACs to construct artificial chromosomes from other species, such as insect, bird, fowl and fish species are also provided. The artificial chromosomes are fully functional stable chromosomes. Two types of artificial chromosomes are provided. One type, herein referred to as SATACs [satellite artificial chromosomes or satellite DNA based artificial chromosomes (the terms are used interchangeably herein)], are stable heterochromatic chromosomes, and the other type are minichromosomes based on amplification of euchromatin.

[0013] Artificial chromosomes provide an extra-genomic locus for targeted integration of megabase [Mb] pair size DNA fragments that contain single or multiple genes, including multiple copies of a single gene operatively linked to one promoter or each copy or several copies linked to separate promoters. Thus, methods using the MACs to introduce the genes into cells, tissues, and animals, as well as species such as birds, fowl, fish and plants, are also provided. The artificial chromosomes with integrated heterologous DNA may be used in methods of gene therapy, in methods of production of gene products, particularly products that require expression of multigenic biosynthetic pathways, and also are intended for delivery into the nuclei of germline cells, such as embryo-derived stem cells [ES cells], for production of transgenic (non-human) animals, birds, fowl and fish. Transgenic plants, including monocots and dicots, are also contemplated herein.

[0014] Mammalian artificial chromosomes provide extra-genomic specific integration sites for introduction of genes encoding proteins of interest and permit megabase size DNA integration so that, for example, genes encoding an entire metabolic pathway, a very large gene, such as the cystic fibrosis [CF; ~250 kb] genomic DNA gene, or several genes, such as multiple genes encoding a series of antigens for preparation of a multivalent vaccine, can be stably introduced into a cell. Vectors for targeted introduction of such genes, including the tumor suppressor genes, such as p53, the cystic fibrosis transmembrane regulator cDNA [CFTR], and the genes for anti-HIV ribozymes, such as an anti-HIV gag ribozyme gene, into the artificial chromosomes are also provided.

[0015] The chromosomes provided herein are generated by introducing heterologous DNA that includes DNA encoding one or multiple selectable marker(s) into cells, preferably a stable cell line, growing the cells under selective conditions, and identifying from among the resulting clones those that include chromosomes with more than one centromere and/or fragments thereof. The amplification that produces the additional centromere or centromeres occurs in cells that contain chromosomes in which the heterologous DNA has integrated near the centromere in the pericentric region of the chromosome. The selected clonal cells are then used to generate artificial chromosomes.

[0016] Although non-targeted introduction of DNA results in some frequency of integration into appropriate loci, targeted introduction is preferred. Hence, in preferred embodiments, the DNA with the selectable marker that is introduced into cells to initiate generation of artificial chromosomes includes sequences that target it to an amplifiable region, such as the pericentric region, heterochromatin, and particularly rDNA of the chromosome. For example, vectors, such as pTEMPUD and pHASPUD [provided herein], which include such DNA specific for mouse satellite DNA and human satellite DNA, respectively, are provided. The plasmid pHASPUD is a derivative of pTEMPUD that contains human satellite DNA sequences that specifically target human chromosomes. Preferred targeting sequences include mammalian ribosomal RNA (rRNA) gene sequences (referred to herein as rDNA) which target the heterologous DNA to integrate into the rDNA region of those chromosomes that contain rDNA. For example, vectors, such as pTERPUD, which include mouse rDNA, are provided. Upon integration into existing chromosomes in the cells, these vectors can induce the amplification that results in generation of additional centromeres.

[0017] Artificial chromosomes are generated by culturing the cells with the multicentric, typically dicentric, chromosomes under conditions whereby the chromosome breaks to form a minichromosome and a formerly dicentric chromosome. Among the MACs provided herein are the SATACs, which are primarily made up of repeating units of short satellite DNA and are nearly fully heterochromatic, so that without insertion of heterologous or foreign DNA, the chromosomes preferably contain no genetic information or contain only non-protein-encoding gene sequences such as rDNA sequences. They can thus be used as “safe” vectors for delivery of DNA to mammalian hosts because they do not contain any potentially harmful genes. The SATACs are generated, not from the minichromosome fragment as, for example, in U.S. Pat. No. 5,288,625, but from the fragment of the formerly dicentric chromosome.

[0018] In addition, methods for generating euchromatic minichromosomes and the use thereof are also provided herein. Methods for generating one type of MAC, the minichromosome, previously described in U.S. Pat. No. 5,288,625, and the use thereof for expression of heterologous DNA are provided. In a particular method provided herein for generating a MAC, such as a minichromosome, heterologous DNA that includes mammalian rDNA and one or more selectable marker genes is introduced into cells which are then grown under selective conditions. Resulting cells that contain chromosomes with more than one centromere are selected and cultured under conditions whereby the chromosome breaks to form a minichromosome and a formerly multicentric (typically dicentric) chromosome from which the minichromosome was released.

[0019] Cell lines containing the minichromosome and the use thereof for cell fusion are also provided. In one embodiment, a cell line containing the mammalian minichromosome is used as recipient cells for donor DNA encoding a selected gene or multiple genes. To facilitate integration of the donor DNA into the minichromosome, the recipient cell line preferably contains the minichromosome but does not also contain the formerly dicentric chromosome. This may be accomplished by methods disclosed herein such as cell fusion and selection of cells that contain a minichromosome and no formerly dicentric chromosome. The donor DNA is linked to a second selectable marker and is targeted to and integrated into the minichromosome. The resulting chromosome is transferred by cell fusion into an appropriate recipient cell line, such as a Chinese hamster cell line [CHO]. After large-scale production of the cells carrying the engineered chromosome, the chromosome is isolated. In particular, metaphase chromosomes are obtained, such as by addition of colchicine, and they are purified from the cell lysate. These chromosomes are used for cloning, sequencing and for delivery of heterologous DNA into cells.

[0020] Also provided are SATACs of various sizes that are formed by repeated culturing under selective conditions and subcloning of cells that contain chromosomes produced from the formerly dicentric chromosomes. The exemplified SATACs are based on repeating DNA units that are about 15 Mb [two ~7.5 Mb blocks]. The repeating DNA unit of SATACs formed from other species and other chromosomes may vary, but typically would be on the order of about 7 to about 20 Mb. The repeating DNA units are referred to herein as megareplicons, which in the exemplified SATACs contain tandem blocks of satellite DNA flanked by non-satellite DNA, including heterologous DNA and non-satellite DNA. Amplification produces an array of chromosome segments [each called an amplicon] that contain two inverted megareplicons bordered by heterologous [“foreign”] DNA. Repeated cell fusion, growth on selective medium and/or BrdU [5-bromodeoxyuridine] treatment or treatment with another genome destabilizing reagent or agent, such as ionizing radiation, including X-rays, and subcloning results in cell lines that carry stable heterochromatic or partially heterochromatic chromosomes, including a 150-200 Mb “sausage” chromosome, a 500-1000 Mb gigachromosome, a stable 250-400 Mb megachromosome and various smaller stable chromosomes derived therefrom. These chromosomes are based on these repeating units and can include heterologous DNA that is expressed.

[0021] Thus, methods for producing MACs of both types (i.e., SATACs and minichromosomes) are provided. These methods are applicable to the production of artificial chromosomes containing centromeres derived from any higher eukaryotic cell, including mammals, birds, fowl, fish, insects and plants.

[0022] The resulting chromosomes can be purified by methods provided herein to provide vectors for introduction of heterologous DNA into selected cells for production of the gene product(s) encoded by the heterologous DNA, for production of transgenic (non-human) animals, birds, fowl, fish and plants or for gene therapy.

[0023] In addition, methods and vectors for fragmenting the minichromosomes and SATACs are provided. Such methods and vectors can be used for in vivo generation of smaller stable artificial chromosomes. Vectors for chromosome fragmentation are used to produce an artificial chromosome that contains a megareplicon, a centromere and two telomeres and will be between about 7.5 Mb and about 60 Mb, preferably between about 10 Mb-15 Mb and 30-50 Mb. As exemplified herein, the preferred range is between about 7.5 Mb and 50 Mb. Such artificial chromosomes may also be produced by other methods.

[0024] Isolation of the 15 Mb [or 30 Mb amplicon containing two 15 Mb inverted repeats] or a 30 Mb or higher multimer, such as 60 Mb, thereof should provide a stable chromosomal vector that can be manipulated in vitro. Methods for reducing the size of the MACs to generate smaller stable self-replicating artificial chromosomes are also provided.

[0025] Also provided herein are methods for producing mammalian artificial chromosomes, including those provided herein, in vitro, and the resulting chromosomes. The methods involve in vitro assembly of the structural and functional elements to provide a stable artificial chromosome. Such elements include a centromere, two telomeres, at least one origin of replication and filler heterochromatin, e.g., satellite DNA. A selectable marker for subsequent selection is also generally included. These specific DNA elements may be obtained from the artificial chromosomes provided herein such as those that have been generated by the introduction of heterologous DNA into cells and the subsequent amplification that leads to the artificial chromosome, particularly the SATACs. Centromere sequences for use in the in vitro construction of artificial chromosomes may also be obtained by employing the centromere cloning methods provided herein. In preferred embodiments, the sequences providing the origin of replication, in particular, the megareplicator, are derived from rDNA. These sequences preferably include the rDNA origin of replication and amplification promoting sequences.

[0026] Methods and vectors for targeting heterologous DNA into the artificial chromosomes are also provided as are methods and vectors for fragmenting the chromosomes to produce smaller but stable and self-replicating artificial chromosomes.

[0027] The chromosomes are introduced into cells to produce stable transformed cell lines or cells, depending upon the source of the cells. Introduction is effected by any suitable method including, but not limited to electroporation, direct uptake, such as by calcium phosphate precipitation, uptake of isolated chromosomes by lipofection, by microcell fusion, by lipid-mediated carrier systems or other suitable method. The resulting cells can be used for production of proteins in the cells. The chromosomes can be isolated and used for gene delivery. Methods for isolation of the chromosomes based on the DNA content of the chromosomes, which differs in MACs versus the authentic chromosomes, are provided. Also provided are methods that rely on content, particularly density, and size of the MACs.

[0028] These artificial chromosomes can be used in gene therapy, gene product production systems, production of humanized genetically transformed animal organs, production of transgenic plants and animals (non-human), including mammals, birds, fowl, fish, invertebrates, vertebrates, reptiles and insects, any organism or device that would employ chromosomal elements as information storage vehicles, and also for analysis and study of centromere function, for the production of artificial chromosome vectors that can be constructed in vitro, and for the preparation of species-specific artificial chromosomes. The artificial chromosomes can be introduced into cells using microinjection, cell fusion, microcell fusion, electroporation, nuclear transfer, electrofusion, projectile bombardment, calcium phosphate precipitation, lipid-mediated transfer systems and other such methods. Cells particularly suited for use with the artificial chromosomes include, but are not limited to plant cells, particularly tomato, arabidopsis, and others, insect cells, including silk worm cells, insect larvae, fish, reptiles, amphibians, arachnids, mammalian cells, avian cells, embryonic stem cells, haematopoietic stem cells, embryos and cells for use in methods of genetic therapy, such as lymphocytes that are used in methods of adoptive immunotherapy and nerve or neural cells. Thus methods of producing gene products and transgenic (non-human) animals and plants are provided. Also provided are the resulting transgenic animals and plants.

[0029] Exemplary cell lines that contain these chromosomes are also provided.

[0030] Methods for preparing artificial chromosomes for particular species and for cloning centromeres are also provided. For example, two exemplary methods provided for generating artificial chromosomes for use in different species are as follows. First, the methods herein may be applied to different species. Second, means for generating species-specific artificial chromosomes and for cloning centromeres are provided. In particular, a method for cloning a centromere from an animal or plant is provided by preparing a library of DNA fragments that contain the genome of the plant or animal and introducing each of the fragments into a mammalian satellite artificial chromosome [SATAC] that contains a centromere from a species, generally a mammal, different from the selected plant or animal, generally a non-mammal, and a selectable marker. The selected plant or animal is one in which the mammalian species centromere does not function. Each of the SATACs is introduced into the cells, which are grown under selective conditions, and cells with SATACs are identified. Such SATACs should contain a centromere encoded by the DNA from the library or should contain the necessary elements for stable replication in the selected species.

[0031] Also provided are libraries in which the relatively large fragments of DNA are contained on artificial chromosomes.

[0032] Transgenic (non-human) animals, invertebrates and vertebrates, plants and insects, fish, reptiles, amphibians, arachnids, birds, fowl, and mammals are also provided. Of particular interest are transgenic (non-human) animals and plants that express genes that confer resistance or reduce susceptibility to disease. For example, the transgene may encode a protein that is toxic to a pathogen, such as a virus, bacterium or pest, but that is not toxic to the transgenic host. Furthermore, since multiple genes can be introduced on a MAC, a series of genes encoding an antigen can be introduced, which upon expression will serve to immunize [in a manner similar to a multivalent vaccine] the host animal against the diseases for which exposure to the antigens provides immunity or some protection.

[0033] Also of interest are transgenic (non-human) animals that serve as models of certain diseases and disorders for use in studying the disease and developing therapeutic treatments and cures thereof. Such animal models of disease express genes [typically carrying a disease-associated mutation], which are introduced into the animal on a MAC and which induce the disease or disorder in the animal. Similarly, MACs carrying genes encoding antisense RNA may be introduced into animal cells to generate conditional “knock-out” transgenic (non-human) animals. In such animals, expression of the antisense RNA results in decreased or complete elimination of the products of genes corresponding to the antisense RNA. Of further interest are transgenic mammals that harbor MAC-carried genes encoding therapeutic proteins that are expressed in the animal's milk. Transgenic (non-human) animals for use in xenotransplantation, which express MAC-carried genes that serve to humanize the animal's organs, are also of interest. Genes that might be used in humanizing animal organs include those encoding human surface antigens.

[0034] Methods for cloning centromeres, such as mammalian centromeres, are also provided. In particular, in one embodiment, a library composed of fragments of SATACs is cloned into YACs [yeast artificial chromosomes] that include a detectable marker, such as DNA encoding tyrosinase, and then introduced into mammalian cells, such as albino mouse embryos. Mice produced from embryos containing such YACs that include a centromere that functions in mammals will express the detectable marker. Thus, if mice are produced from albino mouse embryos into which a functional mammalian centromere was introduced, the mice will be pigmented or have regions of pigmentation.

[0035] A method for producing repeated tandem arrays of DNA is provided. This method, exemplified herein using telomeric DNA, is applicable to any repeat sequence, and in particular, low complexity repeats. The method provided herein for synthesis of arrays of tandem DNA repeats is based on a series of extension steps in which successive doublings of a sequence of repeats result in an exponential expansion of the array of tandem repeats. An embodiment of the method of synthesizing DNA fragments containing tandem repeats may generally be described as follows. Two oligonucleotides are used as starting materials. Oligonucleotide 1 is of length k of repeated sequence (the flanks of which are not relevant) and contains a relatively short stretch (60-90 nucleotides) of the repeated sequence, flanked with appropriately chosen restriction sites:

[0036] 5′-S1>>>>>>>>>>>>>>>>>>>>>>>>>>>S2 _-3′

[0037] where S1 is restriction site 1 cleaved by E1, S2 is a second restriction site cleaved by E2, > represents a simple repeat unit, and ‘_’ denotes a short (8-10) nucleotide flanking sequence complementary to oligonucleotide 2:

[0038] 3′-_S3-5′

[0039] where S3 is a third restriction site for enzyme E3 and which is present in the vector to be used during the construction. The method involves the following steps: (1) oligonucleotides 1 and 2 are annealed; (2) the annealed oligonucleotides are filled-in to produce a double-stranded (ds) sequence; (3) the double-stranded DNA is cleaved with restriction enzymes E1 and E3 and subsequently ligated into a vector (e.g., pUC19 or a yeast vector) that has been cleaved with the same enzymes E1 and E3; (4) the insert is isolated from a first portion of the plasmid by digesting with restriction enzymes E1 and E3, and a second portion of the plasmid is cut with enzymes E2 (treated to remove the 3′-overhang) and E3, and the large fragment (plasmid DNA plus the insert) is isolated; (5) the two DNA fragments (the S1-S3 insert fragment and the vector plus insert) are ligated; and (6) steps 4 and 5 are repeated as many times as needed to achieve the desired repeat sequence size. In each extension cycle, the repeat sequence size doubles, i.e., if m is the number of extension cycles, the size of the repeat sequence will be k×2^(m) nucleotides.
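
The doubling relationship stated above can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions and is not part of the disclosed method: the function names are hypothetical, the repeat unit TTAGGG is used merely as a familiar example of a telomeric repeat, and the restriction sites, flanking sequences and vector backbone are ignored, so that each extension cycle (steps 4 and 5) is modeled simply as joining the repeat array to a copy of itself.

```python
def repeat_length(k: int, m: int) -> int:
    """Length in nucleotides of the repeat array after m extension cycles (k * 2**m)."""
    if k <= 0 or m < 0:
        raise ValueError("k must be positive and m must be non-negative")
    return k * 2 ** m


def cycles_needed(k: int, target: int) -> int:
    """Smallest number of extension cycles m for which k * 2**m >= target."""
    if k <= 0 or target <= 0:
        raise ValueError("k and target must be positive")
    m = 0
    while repeat_length(k, m) < target:
        m += 1
    return m


def simulate_extension(unit: str, copies: int, cycles: int) -> str:
    """Model steps 4-6 as string doubling.

    Assumption: restriction sites, flanks and vector sequence are left out,
    so ligating the insert to the vector-plus-insert fragment is represented
    by concatenating the current array with itself.
    """
    array = unit * copies          # starting stretch carried by oligonucleotide 1
    for _ in range(cycles):
        array = array + array      # one extension cycle doubles the array
    return array


if __name__ == "__main__":
    unit = "TTAGGG"                # illustrative repeat unit (assumption)
    start = simulate_extension(unit, copies=10, cycles=0)
    k = len(start)                 # 60 nt, within the 60-90 nt range given above
    for m in range(6):
        print(f"cycles={m}  repeat length={repeat_length(k, m)} nt")
    print("cycles needed for at least 10,000 nt:", cycles_needed(k, 10_000))
```

Because the length grows as k×2^(m), only a handful of cycles is needed to move from a synthetic oligonucleotide to a kilobase-scale array; the sketch makes no claim about ligation efficiency or about the stability of long arrays in any particular host.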

DESCRIPTION OF THE DRAWINGS

[0040] FIG. 1 is a schematic drawing depicting formation of the MMCneo [the minichromosome] chromosome. A-G represents the successive events consistent with observed data that would lead to the formation and stabilization of the minichromosome.

[0041] FIG. 2 shows a schematic summary of the manner in which the observed new chromosomes would form, and the relationships among the different de novo formed chromosomes. In particular, this figure shows a schematic drawing of the de novo chromosome formation initiated in the centromeric region of mouse chromosome 7. (A) A single E-type amplification in the centromeric region of chromosome 7 generates a neo-centromere linked to the integrated “foreign” DNA, and forms a dicentric chromosome. Multiple E-type amplification forms the λ neo-chromosome, which separates from the remainder of mouse chromosome 7 through a specific breakage between the centromeres of the dicentric chromosome and which was stabilized in a mouse-hamster hybrid cell line; (B) Specific breakage between the centromeres of a dicentric chromosome 7 generates a chromosome fragment with the neo-centromere, and a chromosome 7 with traces of heterologous DNA at the end; (C) Inverted duplication of the fragment bearing the neo-centromere results in the formation of a stable neo-minichromosome; (D) Integration of exogenous DNA into the heterologous DNA region of the formerly dicentric chromosome 7 initiates H-type amplification, and the formation of a heterochromatic arm. By capturing a euchromatic terminal segment, this new chromosome arm is stabilized in the form of the “sausage” chromosome; (E) BrdU [5-bromodeoxyuridine] treatment and/or drug selection induce further H-type amplification, which results in the formation of an unstable gigachromosome; (F) Repeated BrdU treatments and/or drug selection induce further H-type amplification including a centromere duplication, which leads to the formation of another heterochromatic chromosome arm. It is split off from the chromosome 7 by chromosome breakage, and by acquiring a terminal segment, the stable megachromosome is formed.

[0042] FIG. 3 is a schematic diagram of the replicon structure and a scheme by which a megachromosome could be produced.

[0043] FIG. 4 sets forth the relationships among some of the exemplary cell lines described herein.

[0044] FIG. 5 is a diagram of the plasmid pTEMPUD.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

[0045] Definitions

[0046] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of skill in the art to which this invention belongs. All patents and publications referred to herein are incorporated by reference.

[0047] As used herein, a mammalian artificial chromosome [MAC] is a piece of DNA that can stably replicate and segregate alongside endogenous chromosomes. It has the capacity to accommodate and express heterologous genes inserted therein. It is referred to as a mammalian artificial chromosome because it includes an active mammalian centromere(s). Plant artificial chromosomes, insect artificial chromosomes and avian artificial chromosomes refer to chromosomes that include plant, insect and avian centromeres, respectively. A human artificial chromosome [HAC] refers to chromosomes that include human centromeres, BUGACs refer to insect artificial chromosomes, and AVACs refer to avian artificial chromosomes. Among the MACs provided herein are SATACs, minichromosomes, and in vitro synthesized artificial chromosomes. Methods for construction of each type are provided herein.

[0048] As used herein, in vitro synthesized artificial chromosomes are artificial chromosomes that are produced by joining the essential components (at least the centromere, and origins of replication) in vitro.

[0049] As used herein, endogenous chromosomes refer to genomic chromosomes as found in the cell prior to generation or introduction of a MAC.

[0050] As used herein, stable maintenance of chromosomes occurs when at least about 85%, preferably 90%, more preferably 95%, of the cells retain the chromosome. Stability is measured in the presence of a selective agent. Preferably these chromosomes are also maintained in the absence of a selective agent. Stable chromosomes also retain their structure during cell culturing, suffering neither intrachromosomal nor interchromosomal rearrangements.
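The retention thresholds in this definition can be expressed as a small calculation. The following Python sketch is illustrative only: the function names and the idea of scoring a fixed sample of cells are assumptions, and because the definition reads "at least about", the hard cutoffs used here are a simplification. It also covers only the retention criterion, not the requirement that the chromosome retain its structure.

```python
def retention_fraction(cells_with_chromosome: int, cells_scored: int) -> float:
    """Fraction of scored cells that still carry the chromosome."""
    if cells_scored <= 0:
        raise ValueError("cells_scored must be positive")
    if not 0 <= cells_with_chromosome <= cells_scored:
        raise ValueError("cells_with_chromosome must be between 0 and cells_scored")
    return cells_with_chromosome / cells_scored


def stability_tier(fraction: float) -> str:
    """Map a retention fraction onto the 85% / 90% / 95% tiers of the definition."""
    if fraction >= 0.95:
        return "more preferred"
    if fraction >= 0.90:
        return "preferred"
    if fraction >= 0.85:
        return "stable"
    return "not stable"


if __name__ == "__main__":
    # Hypothetical count: 92 of 100 scored cells retain the chromosome.
    print(stability_tier(retention_fraction(92, 100)))
```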

[0051] As used herein, growth under selective conditions means growth of a cell under conditions that require expression of a selectable marker for survival.

[0052] As used herein, an agent that destabilizes a chromosome is any agent known by those of skill in the art to enhance amplification events or mutations. Such agents, which include BrdU, are well known to those of skill in the art.

[0053] As used herein, de novo, with reference to a centromere, refers to generation of an excess centromere as a result of incorporation of a heterologous DNA fragment using the methods herein.

[0054] As used herein, euchromatin and heterochromatin have their recognized meanings: euchromatin refers to chromatin that stains diffusely and that typically contains genes, and heterochromatin refers to chromatin that remains unusually condensed and that has been thought to be transcriptionally inactive. Highly repetitive DNA sequences [satellite DNA], at least with respect to mammalian cells, are usually located in regions of the heterochromatin surrounding the centromere [pericentric heterochromatin]. Constitutive heterochromatin refers to heterochromatin that contains the highly repetitive DNA which is constitutively condensed and genetically inactive.

[0055] As used herein, BrdU refers to 5-bromodeoxyuridine, which during replication is inserted in place of thymidine. BrdU is used as a mutagen; it also inhibits condensation of metaphase chromosomes during cell division.

[0056] As used herein, a dicentric chromosome is a chromosome that contains two centromeres. A multicentric chromosome contains more than two centromeres.

[0057] As used herein, a formerly dicentric chromosome is a chromosome that is produced when a dicentric chromosome fragments and acquires new telomeres so that two chromosomes, each having one of the centromeres, are produced. Each of the fragments is a replicable chromosome. If one of the chromosomes undergoes amplification of euchromatic DNA to produce a fully functional chromosome that contains the newly introduced heterologous DNA and primarily [at least more than 50%] euchromatin, it is a minichromosome. The remaining chromosome is a formerly dicentric chromosome. If one of the chromosomes undergoes amplification, whereby heterochromatin [satellite DNA] is amplified and a euchromatic portion [or arm] remains, it is referred to as a sausage chromosome. A chromosome that is substantially all heterochromatin, except for portions of heterologous DNA, is called a SATAC. Such chromosomes [SATACs] can be produced from sausage chromosomes by culturing the cell containing the sausage chromosome under conditions, such as BrdU treatment and/or growth under selective conditions, that destabilize the chromosome so that a satellite artificial chromosome [SATAC] is produced. For purposes herein, it is understood that SATACs may not necessarily be produced in multiple steps, but may appear after the initial introduction of the heterologous DNA and growth under selective conditions, or they may appear after several cycles of growth under selective conditions and BrdU treatment.

[0058] As used herein, a SATAC refers to a chromosome that is substantially all heterochromatin, except for portions of heterologous DNA. Typically, SATACs are satellite DNA based artificial chromosomes, but the term encompasses any chromosome made by the methods herein that contains more heterochromatin than euchromatin.

[0059] As used herein, amplifiable, when used in reference to a chromosome, particularly the method of generating SATACs provided herein, refers to a region of a chromosome that is prone to amplification. Amplification typically occurs during replication and other cellular events involving recombination. Such regions are typically regions of the chromosome that include tandem repeats, such as satellite DNA, rDNA and other such sequences.

[0060] As used herein, amplification, with reference to DNA, is a process in which segments of DNA are duplicated to yield two or multiple copies of identical or nearly identical DNA segments that are typically joined as substantially tandem or successive repeats or inverted repeats.

[0061] As used herein, an amplicon is a repeated DNA amplification unit that contains a set of inverted repeats of the megareplicon. A megareplicon represents a higher order replication unit. For example, with reference to the SATACs, the megareplicon contains a set of tandem DNA blocks each containing satellite DNA flanked by non-satellite DNA. Contained within the megareplicon is a primary replication site, referred to as the megareplicator, which may be involved in organizing and facilitating replication of the pericentric heterochromatin and possibly the centromeres. Within the megareplicon there may be smaller [e.g., 50-300 kb in some mammalian cells] secondary replicons. In the exemplified SATACs, the megareplicon is defined by two tandem ~7.5 Mb DNA blocks [see, e.g., FIG. 3]. Within each artificial chromosome [AC] or among a population thereof, each amplicon has the same gross structure but may contain sequence variations. Such variations will arise as a result of movement of mobile genetic elements, deletions or insertions or mutations that arise, particularly in culture. Such variation does not affect the use of the ACs or their overall structure as described herein.
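The size relationships in this definition can be worked through numerically. The Python sketch below is a rough estimate only: it assumes that an amplicon is one inverted pair of megareplicons, it ignores the heterologous DNA, centromere and telomeres, and all names are illustrative rather than taken from the specification.

```python
BLOCK_MB = 7.5                    # approximate size of one tandem DNA block
BLOCKS_PER_MEGAREPLICON = 2       # "two tandem ~7.5 Mb DNA blocks"
MEGAREPLICONS_PER_AMPLICON = 2    # assumption: one inverted pair per amplicon


def megareplicon_size_mb() -> float:
    """Approximate megareplicon size in Mb."""
    return BLOCK_MB * BLOCKS_PER_MEGAREPLICON


def amplicon_size_mb() -> float:
    """Approximate amplicon size in Mb, ignoring inserted heterologous DNA."""
    return megareplicon_size_mb() * MEGAREPLICONS_PER_AMPLICON


def approx_amplicon_count(chromosome_size_mb: float) -> float:
    """Rough number of amplicons that would fit in a chromosome of the given size."""
    if chromosome_size_mb < 0:
        raise ValueError("chromosome_size_mb must be non-negative")
    return chromosome_size_mb / amplicon_size_mb()


if __name__ == "__main__":
    print(megareplicon_size_mb(), amplicon_size_mb())
    print(approx_amplicon_count(300.0))  # hypothetical 300 Mb chromosome
```

Under these assumptions a megareplicon is about 15 Mb and an amplicon about 30 Mb, so the amplicon count scales linearly with chromosome size.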

[0062] As used herein, ribosomal RNA [rRNA] is the specialized RNA that forms part of the structure of a ribosome and participates in the synthesis of proteins. Ribosomal RNA is produced by transcription of genes which, in eukaryotic cells, are present in multiple copies. In human cells, the approximately 250 copies of rRNA genes per haploid genome are spread out in clusters on at least five different chromosomes (chromosomes 13, 14, 15, 21 and 22). In mouse cells, the presence of ribosomal DNA [rDNA] has been verified on at least 11 pairs out of 20 mouse chromosomes [chromosomes 5, 6, 9, 11, 12, 15, 16, 17, 18, 19 and X] [see, e.g., Rowe et al. (1996) Mamm. Genome 7:886-889 and Johnson et al. (1993) Mamm. Genome 4:49-52]. In eukaryotic cells, the multiple copies of the highly conserved rRNA genes are located in a tandemly arranged series of rDNA units, which are generally about 40-45 kb in length and contain a transcribed region and a nontranscribed region known as spacer (i.e., intergenic spacer) DNA which can vary in length and sequence. In the human and mouse, these tandem arrays of rDNA units are located adjacent to the pericentric satellite DNA sequences (heterochromatin). The regions of these chromosomes in which the rDNA is located are referred to as nucleolar organizing regions (NOR) which loop into the nucleolus, the site of ribosome production within the cell nucleus.
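Combining the copy number and unit length given above yields an order-of-magnitude figure for the total rDNA per haploid human genome. The Python sketch below is a back-of-the-envelope calculation; it assumes every copy is a full-length unit, which the specification does not state, and the function name is illustrative.

```python
def rdna_span_kb(copies: int, unit_kb_min: int, unit_kb_max: int) -> tuple[int, int]:
    """Lower and upper estimate (in kb) of total rDNA for a given copy number."""
    if copies < 0 or unit_kb_min < 0 or unit_kb_max < unit_kb_min:
        raise ValueError("invalid copy number or unit length range")
    return copies * unit_kb_min, copies * unit_kb_max


if __name__ == "__main__":
    # About 250 copies per haploid human genome, units of about 40-45 kb.
    low_kb, high_kb = rdna_span_kb(250, 40, 45)
    print(low_kb / 1000, high_kb / 1000)  # values in Mb
```

On these figures the rDNA arrays would total roughly 10 to 11 Mb per haploid genome, spread over the five chromosomes named above.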

[0063] As used herein, the minichromosome refers to a chromosome derived from a multicentric, typically dicentric, chromosome [see, e.g., FIG. 1] that contains more euchromatic than heterochromatic DNA.
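The compositional distinction drawn in these definitions (a minichromosome contains more euchromatic than heterochromatic DNA, whereas a SATAC contains more heterochromatin than euchromatin) can be sketched as a simple comparison. This Python example is a simplification under stated assumptions: it presumes the chromosome was made by the methods herein and that the two amounts have been measured, it does not distinguish sausage chromosomes, and the labels and function name are illustrative.

```python
def composition_label(euchromatin_mb: float, heterochromatin_mb: float) -> str:
    """Label a chromosome by whether euchromatin or heterochromatin predominates."""
    if euchromatin_mb < 0 or heterochromatin_mb < 0:
        raise ValueError("amounts must be non-negative")
    if euchromatin_mb + heterochromatin_mb == 0:
        raise ValueError("chromosome has no measured chromatin")
    if euchromatin_mb > heterochromatin_mb:
        return "minichromosome-like"
    if heterochromatin_mb > euchromatin_mb:
        return "SATAC-like"
    return "indeterminate"


if __name__ == "__main__":
    # Hypothetical measurements in Mb.
    print(composition_label(20.0, 5.0))
    print(composition_label(10.0, 240.0))
```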

[0064] As used herein, a megachromosome refers to a chromosome that, except for introduced heterologous DNA, is substantially composed of heterochromatin. Megachromosomes are made of an array of repeated amplicons that contain two inverted megareplicons bordered by introduced heterologous DNA [see, e.g., FIG. 3 for a schematic drawing of a megachromosome]. For purposes herein, a megachromosome is about 50 to 400 Mb, generally about 250-400 Mb. Shorter variants are also referred to as truncated megachromosomes [about 90 to 120 or 150 Mb], dwarf megachromosomes [~150-200 Mb] and cell lines, and a micro-megachromosome [~50-90 Mb, typically 50-60 Mb]. For purposes herein, the term megachromosome refers to the overall repeated structure based on an array of repeated chromosomal segments [amplicons] that contain two inverted megareplicons bordered by any inserted heterologous DNA. The size will be specified.
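The size ranges in this definition can be tabulated for quick lookup. The Python sketch below is illustrative: the ranges are approximate in the text ("about"), the truncated class is taken at its wider bound of 150 Mb, the label "typical megachromosome" for the 250-400 Mb range is a name introduced here, and sizes at a shared boundary are reported under both neighboring classes.

```python
# (label, lower bound in Mb, upper bound in Mb), read from the definition above.
NAMED_SIZE_CLASSES = [
    ("micro-megachromosome", 50, 90),
    ("truncated megachromosome", 90, 150),
    ("dwarf megachromosome", 150, 200),
    ("typical megachromosome", 250, 400),  # label is an assumption
]


def in_megachromosome_range(size_mb: float) -> bool:
    """True if the size falls in the overall range of about 50 to 400 Mb."""
    return 50 <= size_mb <= 400


def size_labels(size_mb: float) -> list[str]:
    """Return every named size class whose range contains size_mb."""
    return [name for name, low, high in NAMED_SIZE_CLASSES if low <= size_mb <= high]


if __name__ == "__main__":
    for size in (60, 150, 225, 300):  # hypothetical sizes in Mb
        print(size, in_megachromosome_range(size), size_labels(size))
```

Sizes between about 200 and 250 Mb fall inside the overall range but under no named variant, which is one reason the definition ends by stating that the size will be specified.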

[0065] As used herein, genetic therapy involves the transfer or insertion of heterologous DNA into certain cells, target cells, to produce specific gene products that are involved in correcting or modulating disease. The DNA is introduced into the selected target cells in a manner such that the heterologous DNA is expressed and a product encoded thereby is produced. Alternatively, the heterologous DNA may in some manner mediate expression of DNA that encodes the therapeutic product. It may encode a product, such as a peptide or RNA, that in some manner mediates, directly or indirectly, expression of a therapeutic product. Genetic therapy may also be used to introduce therapeutic compounds, such as TNF, that are not normally produced in the host or that are not produced in therapeutically effective amounts or at a therapeutically useful time. Expression of the heterologous DNA by the target cells within an organism afflicted with the disease thereby enables modulation of the disease. The heterologous DNA encoding the therapeutic product may be modified prior to introduction into the cells of the afflicted host in order to enhance or otherwise alter the product or expression thereof.

[0066] As used herein, heterologous or foreign DNA and RNA are used interchangeably and refer to DNA or RNA that does not occur naturally as part of the genome in which it is present or which is found in a location or locations in the genome that differ from that in which it occurs in nature. It is DNA or RNA that is not endogenous to the cell and has been exogenously introduced into the cell. Examples of heterologous DNA include, but are not limited to, DNA that encodes a gene product or gene product(s) of interest, introduced for purposes of gene therapy or for production of an encoded protein. Other examples of heterologous DNA include, but are not limited to, DNA that encodes traceable marker proteins, such as a protein that confers drug resistance, DNA that encodes therapeutically effective substances, such as anti-cancer agents, enzymes and hormones, and DNA that encodes other types of proteins, such as antibodies. Antibodies that are encoded by heterologous DNA may be secreted or expressed on the surface of the cell in which the heterologous DNA has been introduced.

[0067] As used herein, a therapeutically effective product is a product encoded by heterologous DNA that, upon introduction of the DNA into a host, is expressed and effectively ameliorates or eliminates the symptoms or manifestations of an inherited or acquired disease, or cures said disease.

[0068] As used herein, transgenic plants refer to plants in which heterologous or foreign DNA is expressed or in which the expression of a gene naturally present in the plant has been altered.

[0069] As used herein, operative linkage of heterologous DNA to regulatory and effector sequences of nucleotides, such as promoters, enhancers, transcriptional and translational stop sites, and other signal sequences refers to the relationship between such DNA and such sequences of nucleotides. For example, operative linkage of heterologous DNA to a promoter refers to the physical relationship between the DNA and the promoter such that the transcription of such DNA is initiated from the promoter by an RNA polymerase that specifically recognizes, binds to and transcribes the DNA in reading frame. Preferred promoters include tissue specific promoters, such as mammary gland specific promoters, viral promoters, such as TK, CMV and adenovirus promoters, and other promoters known to those of skill in the art.

[0070] As used herein, isolated, substantially pure DNA refers to DNA fragments purified according to standard techniques employed by those skilled in the art, such as that found in Maniatis et al. [(1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.].

[0071] As used herein, expression refers to the process by which nucleic acid is transcribed into mRNA and translated into peptides, polypeptides, or proteins. If the nucleic acid is derived from genomic DNA, expression may, if an appropriate eukaryotic host cell or organism is selected, include splicing of the mRNA.

[0072] As used herein, vector or plasmid refers to discrete elements that are used to introduce heterologous DNA into cells for either expression of the heterologous DNA or for replication of the cloned heterologous DNA. Selection and use of such vectors and plasmids are well within the level of skill of the art.

[0073] As used herein, transformation/transfection refers to the process by which DNA or RNA is introduced into cells. Transfection refers to the taking up of exogenous nucleic acid, e.g., an expression vector, by a host cell whether or not any coding sequences are in fact expressed. Numerous methods of transfection are known to the ordinarily skilled artisan, for example, direct uptake using calcium phosphate [CaPO4; see, e.g., Wigler et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1373-1376], polyethylene glycol [PEG]-mediated DNA uptake, electroporation, lipofection [see, e.g., Strauss (1996) Meth. Mol. Biol. 54:307-327], microcell fusion [see EXAMPLES; see also Lambert (1991) Proc. Natl. Acad. Sci. U.S.A. 88:5907-5911; U.S. Pat. No. 5,396,767; Sawford et al. (1987) Somatic Cell Mol. Genet. 13:279-284; Dhar et al. (1984) Somatic Cell Mol. Genet. 10:547-559; and McNeill-Killary et al. (1995) Meth. Enzymol. 254:133-152], lipid-mediated carrier systems [see, e.g., Teifel et al. (1995) Biotechniques 19:79-80; Albrecht et al. (1996) Ann. Hematol. 72:73-79; Holmen et al. (1995) In Vitro Cell Dev. Biol. Anim. 31:347-351; Remy et al. (1994) Bioconjug. Chem. 5:647-654; Le Bolch et al. (1995) Tetrahedron Lett. 36:6681-6684; Loeffler et al. (1993) Meth. Enzymol. 217:599-618] or other suitable method. Successful transfection is generally recognized by detection of the presence of the heterologous nucleic acid within the transfected cell, such as any indication of the operation of a vector within the host cell. Transformation means introducing DNA into an organism so that the DNA is replicable, either as an extrachromosomal element or by chromosomal integration.

[0074] As used herein, injected refers to the microinjection [use of a small syringe] of DNA into a cell.

[0075] As used herein, substantially homologous DNA refers to DNA that includes a sequence of nucleotides that is sufficiently similar to another such sequence to form stable hybrids under specified conditions.

[0076] It is well known to those of skill in this art that nucleic acid fragments with different sequences may, under the same conditions, hybridize detectably to the same “target” nucleic acid. Two nucleic acid fragments hybridize detectably, under stringent conditions over a sufficiently long hybridization period, because one fragment contains a segment of at least about 14 nucleotides in a sequence which is complementary [or nearly complementary] to the sequence of at least one segment in the other nucleic acid fragment. If the time during which hybridization is allowed to occur is held constant, at a value during which, under preselected stringency conditions, two nucleic acid fragments with exactly complementary base-pairing segments hybridize detectably to each other, departures from exact complementarity can be introduced into the base-pairing segments, and base-pairing will nonetheless occur to an extent sufficient to make hybridization detectable. As the departure from complementarity between the base-pairing segments of two nucleic acids becomes larger, and as conditions of the hybridization become more stringent, the probability decreases that the two segments will hybridize detectably to each other.

[0077] Two single-stranded nucleic acid segments have “substantially the same sequence,” within the meaning of the present specification, if (a) both form a base-paired duplex with the same segment, and (b) the melting temperatures of said two duplexes in a solution of 0.5×SSPE differ by less than 10° C. If the segments being compared have the same number of bases, then to have “substantially the same sequence”, they will typically differ in their sequences at fewer than 1 base in 10. Methods for determining melting temperatures of nucleic acid duplexes are well known [see, e.g., Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284 and references cited therein].
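The two numerical criteria in this definition can be written out as simple checks. The Python sketch below is an illustration under stated assumptions: the melting temperatures are taken as measured inputs (the code does not compute them), the "fewer than 1 base in 10" test is only the typical consequence described for equal-length segments and not the definition itself, and the function names are hypothetical.

```python
def melting_temperatures_agree(tm_a_celsius: float, tm_b_celsius: float) -> bool:
    """Criterion (b): duplex melting temperatures differ by less than 10 degrees C."""
    return abs(tm_a_celsius - tm_b_celsius) < 10.0


def differ_at_fewer_than_one_in_ten(seq_a: str, seq_b: str) -> bool:
    """For equal-length segments, test for mismatches at fewer than 1 base in 10."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("segments must be non-empty and of equal length")
    mismatches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    # Integer comparison avoids floating-point rounding at the 10% boundary.
    return mismatches * 10 < len(seq_a)


if __name__ == "__main__":
    # Hypothetical 20-base segments differing at one position.
    print(differ_at_fewer_than_one_in_ten("ACGTACGTACGTACGTACGT",
                                          "ACGTACGTACGTACGTACGA"))
    # Hypothetical measured melting temperatures in 0.5xSSPE.
    print(melting_temperatures_agree(71.5, 66.0))
```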

[0078] As used herein, a nucleic acid probe is a DNA or RNA fragment that includes a sufficient number of nucleotides to specifically hybridize to DNA or RNA that includes identical or closely related sequences of nucleotides. A probe may contain any number of nucleotides, from as few as about 10 to as many as hundreds of thousands of nucleotides. The conditions and protocols for such hybridization reactions are well known to those of skill in the art as are the effects of probe size, temperature, degree of mismatch, salt concentration and other parameters on the hybridization reaction. For example, the lower the temperature and higher the salt concentration at which the hybridization reaction is carried out, the greater the degree of mismatch that may be present in the hybrid molecules.
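The qualitative dependence on temperature and salt described above can be illustrated numerically. The Python sketch below does not come from this specification: it uses a widely cited empirical approximation for the melting temperature of long DNA duplexes (of the form commonly attributed to Meinkoth and Wahl (1984)) together with the rule of thumb that each 1% of mismatch lowers the melting temperature by roughly 1° C. The function names, the default slope and the example values are assumptions.

```python
import math


def approx_tm_celsius(na_molar: float, gc_percent: float, length_bp: int,
                      formamide_percent: float = 0.0) -> float:
    """Empirical Tm estimate for a perfectly matched long DNA duplex."""
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("na_molar and length_bp must be positive")
    return (81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent
            - 0.61 * formamide_percent - 500.0 / length_bp)


def approx_tolerated_mismatch_percent(tm_perfect: float, hyb_temp: float,
                                      degrees_per_percent: float = 1.0) -> float:
    """Rough percent mismatch tolerated at hyb_temp (rule-of-thumb slope)."""
    return max(0.0, (tm_perfect - hyb_temp) / degrees_per_percent)


if __name__ == "__main__":
    # Hypothetical 500 bp probe with 50% GC content, no formamide.
    for na, temp in ((0.02, 65.0), (0.2, 50.0)):
        tm = approx_tm_celsius(na, 50.0, 500)
        print(na, temp, round(approx_tolerated_mismatch_percent(tm, temp), 1))
```

In this sketch the lower-temperature, higher-salt condition tolerates more mismatch than the higher-temperature, lower-salt one, which is the direction of the effect stated in the paragraph above.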

[0079] To be used as a hybridization probe, the nucleic acid is generally rendered detectable by labelling it with a detectable moiety or label, such as ³²P, ³H and ¹⁴C, or by other means, including chemical labelling, such as by nick-translation in the presence of deoxyuridylate biotinylated at the 5′-position of the uracil moiety. The resulting probe includes the biotinylated uridylate in place of thymidylate residues and can be detected [via the biotin moieties] by any of a number of commercially available detection systems based on binding of streptavidin to the biotin. Such commercially available detection systems can be obtained, for example, from Enzo Biochemicals, Inc. [New York, N.Y.]. Any other label known to those of skill in the art, including non-radioactive labels, may be used as long as it renders the probes sufficiently detectable, which is a function of the sensitivity of the assay, the time available [for culturing cells, extracting DNA, and hybridization assays], the quantity of DNA or RNA available as a source of the probe, the particular label and the means used to detect the label.

[0080] Once sequences with a sufficiently high degree of homology to the probe are identified, they can readily be isolated by standard techniques, which are described, for example, by Maniatis et al. ((1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).

[0081] As used herein, conditions under which DNA molecules form stable hybrids and are considered substantially homologous are such that DNA molecules with at least about 60% complementarity form stable hybrids. Such DNA fragments are herein considered to be “substantially homologous”. For example, DNA that encodes a particular protein is substantially homologous to another DNA fragment if the DNA forms stable hybrids such that the sequences of the fragments are at least about 60% complementary and if a protein encoded by the DNA retains its activity.
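The 60% figure can be made concrete with a simple count of base pairs. The Python sketch below is a simplification under stated assumptions: it compares two equal-length strands, both written 5′ to 3′, in an ungapped antiparallel alignment, whereas the definition above is framed in terms of stable hybrid formation and retained protein activity, which a sequence count cannot establish. The function names are hypothetical.

```python
_WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(strand_a: str, strand_b: str) -> float:
    """Percent of positions forming Watson-Crick pairs in antiparallel alignment."""
    a = strand_a.upper()
    b_reversed = strand_b.upper()[::-1]  # read strand_b 3' to 5' against strand_a
    if len(a) != len(b_reversed) or not a:
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(1 for x, y in zip(a, b_reversed) if _WATSON_CRICK.get(x) == y)
    return 100.0 * paired / len(a)


def meets_sixty_percent(strand_a: str, strand_b: str) -> bool:
    """True if the ungapped complementarity is at least 60%."""
    return percent_complementarity(strand_a, strand_b) >= 60.0


if __name__ == "__main__":
    # Hypothetical short strands, both written 5' to 3'.
    print(percent_complementarity("ATGC", "GCAT"))
    print(meets_sixty_percent("AAAAAAAAAA", "TTTTTTCCCC"))
```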

[0082] For purposes herein, the following stringency conditions are defined:

[0083] 1) high stringency: 0.1×SSPE, 0.1% SDS, 65° C.

[0084] 2) medium stringency: 0.2×SSPE, 0.1% SDS, 50° C.

[0085] 3) low stringency: 1.0×SSPE, 0.1% SDS, 50° C.

[0086] or any combination of salt and temperature and other reagents that result in selection of the same degree of mismatch or matching.
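The three defined conditions can be held in a small lookup structure, for example when annotating hybridization records. The Python sketch below is illustrative; the class and function names are assumptions, and the ordering helper relies only on the general principle stated in this specification that lower temperature and higher salt permit more mismatch. It does not model the equivalent combinations of salt, temperature and other reagents that the definition also allows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StringencyCondition:
    sspe_x: float       # SSPE concentration as a multiple of 1x
    sds_percent: float  # SDS concentration in percent
    temp_c: float       # temperature in degrees C


STRINGENCY = {
    "high": StringencyCondition(sspe_x=0.1, sds_percent=0.1, temp_c=65.0),
    "medium": StringencyCondition(sspe_x=0.2, sds_percent=0.1, temp_c=50.0),
    "low": StringencyCondition(sspe_x=1.0, sds_percent=0.1, temp_c=50.0),
}


def at_least_as_stringent(a: StringencyCondition, b: StringencyCondition) -> bool:
    """True if a has salt no higher and temperature no lower than b."""
    return a.sspe_x <= b.sspe_x and a.temp_c >= b.temp_c


if __name__ == "__main__":
    print(STRINGENCY["high"])
    print(at_least_as_stringent(STRINGENCY["medium"], STRINGENCY["low"]))
```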

[0087] As used herein, immunoprotective refers to the ability of a vaccine or exposure to an antigen or immunity-inducing agent, to confer upon a host to whom the vaccine or antigen is administered or introduced, the ability to resist infection by a disease-causing pathogen or to have reduced symptoms. The selected antigen is typically an antigen that is presented by the pathogen.

[0088] As used herein, all assays and procedures, such as hybridization reactions and antibody-antigen reactions, unless otherwise specified, are conducted under conditions recognized by those of skill in the art as standard conditions.

[0089] A. Preparation of Cell Lines Containing MACs

[0090] 1. The Megareplicon

[0091] The methods, cells and MACs provided herein are produced by virtue of the discovery of the existence of a higher-order replication unit [megareplicon] of the centromeric region. This megareplicon is delimited by a primary replication initiation site [megareplicator], and appears to facilitate replication of the centromeric heterochromatin, and most likely, centromeres. Integration of heterologous DNA into the megareplicator region or in close proximity thereto, initiates a large-scale amplification of megabase-size chromosomal segments, which leads to de novo chromosome formation in living cells.

[0092] DNA sequences that provide a preferred megareplicator are the rDNA units that give rise to ribosomal RNA (rRNA). In mammals, particularly mice and humans, these rDNA units contain specialized elements, such as the origin of replication (or origin of bidirectional replication, i.e., OBR, in mouse) and amplification promoting sequences (APS) and amplification control elements (ACE) (see, e.g., Gogel et al. (1996) Chromosoma 104:511-518; Coffman et al. (1993) Exp. Cell. Res. 209:123-132; Little et al. (1993) Mol. Cell. Biol. 13:6600-6613; Yoon et al. (1995) Mol. Cell. Biol. 15:2482-2489; Gonzalez and Sylvester (1995) Genomics 27:320-328; Miesfeld and Arnheim (1982) Nuc. Acids Res. 10:3933-3949; Maden et al. (1987) Biochem. J. 246:519-527).

[0093] As described herein, without being bound by any theory, these specialized elements may facilitate replication and/or amplification of megabase-size chromosomal segments in the de novo formation of chromosomes, such as those described herein, in cells. These specialized elements are typically located in the nontranscribed intergenic spacer region upstream of the transcribed region of rDNA. The intergenic spacer region may itself contain internally repeated sequences which can be classified as tandemly repeated blocks and nontandem blocks (see, e.g., Gonzalez and Sylvester (1995) Genomics 27:320-328). In mouse rDNA, an origin of bidirectional replication may be found within a 3-kb initiation zone centered approximately 1.6 kb upstream of the transcription start site (see, e.g., Gogel et al. (1996) Chromosoma 104:511-518). The sequences of these specialized elements tend to have an altered chromatin structure, which may be detected, for example, by nuclease hypersensitivity or the presence of AT-rich regions that can give rise to bent DNA structures. An exemplary sequence encompassing an origin of replication is shown in SEQ ID NO. 16 and in GENBANK accession no. X82564 at about positions 2430-5435. Exemplary sequences encompassing amplification-promoting sequences include nucleotides 690-1060 and 1105-1530 of SEQ ID NO. 16.
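Nucleotide positions such as those recited above are conventionally 1-based and inclusive, whereas most programming languages index from zero with an exclusive end. The Python sketch below shows the conversion; it assumes the recited positions follow the 1-based inclusive convention, the helper names are hypothetical, and the short string in the demonstration is a placeholder rather than any part of SEQ ID NO. 16.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive nucleotide range."""
    if not 1 <= start <= end:
        raise ValueError("require 1 <= start <= end")
    return end - start + 1


def subsequence(sequence: str, start: int, end: int) -> str:
    """Extract nucleotides start..end (1-based, inclusive) from a sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range lies outside the sequence")
    return sequence[start - 1:end]


if __name__ == "__main__":
    # Lengths of the two amplification-promoting ranges recited above.
    print(span_length(690, 1060), span_length(1105, 1530))
    # Placeholder sequence, for demonstration only.
    print(subsequence("ACGTACGT", 2, 4))
```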

[0094] In human rDNA, a primary replication initiation site may be found a few kilobase pairs upstream of the transcribed region and secondary initiation sites may be found throughout the nontranscribed intergenic spacer region (see, e.g., Yoon et al. (1995) Mol. Cell. Biol. 15:2482-2489). A complete human rDNA repeat unit is presented in GENBANK as accession no. U13369 and is set forth in SEQ ID NO. 17 herein. Another exemplary sequence encompassing a replication initiation site may be found within the sequence of nucleotides 35355-42486 in SEQ ID NO. 17, particularly within the sequence of nucleotides 37912-42486 and more particularly within the sequence of nucleotides 37912-39288 of SEQ ID NO. 17 (see Coffman et al. (1993) Exp. Cell. Res. 209:123-132).

[0095] Cell lines containing MACs can be prepared by transforming cells, preferably a stable cell line, with a heterologous DNA fragment that encodes a selectable marker, culturing under selective conditions, and identifying cells that have a multicentric, typically dicentric, chromosome. These cells can then be manipulated as described herein to produce the minichromosomes and other MACs, particularly the heterochromatic SATACs, as described herein.

[0096] Development of a multicentric, particularly dicentric, chromosome typically is effected through integration of the heterologous DNA in the pericentric heterochromatin, preferably in the centromeric regions of chromosomes carrying rDNA sequences. Thus, the frequency of incorporation can be increased by targeting to these regions, such as by including DNA, including, but not limited to, rDNA or satellite DNA, in the heterologous fragment that encodes the selectable marker. Among the preferred targeting sequences for directing the heterologous DNA to the pericentromeric heterochromatin are rDNA sequences that target centromeric regions of chromosomes that carry rRNA genes. Such sequences include, but are not limited to, the DNA of SEQ ID NO. 16 and GENBANK accession no. X82564 and portions thereof, the DNA of SEQ ID NO. 17 and GENBANK accession no. U13369 and portions thereof, and the DNA of SEQ ID NOS. 18-24. A particular vector incorporating DNA from within SEQ ID NO. 16 for use in directing integration of heterologous DNA into chromosomal rDNA is pTERPUD (see Example 12). Satellite DNA sequences can also be used to direct the heterologous DNA to integrate into the pericentric heterochromatin. For example, vectors pTEMPUD and pHASPUD, which contain mouse and human satellite DNA, respectively, are provided herein (see Example 12) as exemplary vectors for introduction of heterologous DNA into cells for de novo artificial chromosome formation.

[0097] The resulting cell lines can then be treated as the exemplified cells herein to produce cells in which the dicentric chromosome has fragmented. The cells can then be used to introduce additional selective markers into the fragmented dicentric chromosome (i.e., formerly dicentric chromosome), whereby amplification of the pericentric heterochromatin will produce the heterochromatic chromosomes.

[0098] The following discussion describes this process with reference to the EC3/7 line and the resulting cells. The same procedures can be applied to any other cells, particularly cell lines to create SATACs and euchromatic minichromosomes.

[0099] 2. Formation of De novo Chromosomes

[0100] De novo centromere formation in a transformed mouse LMTK⁻ fibroblast cell line [EC3/7] after cointegration of λ constructs [λCM8 and λgtWESneo] carrying human and bacterial DNA [Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110 and U.S. application Ser. No. 08/375,271] has been shown. The integration of the “heterologous” engineered human, bacterial and phage DNA, and the subsequent amplification of mouse and heterologous DNA that led to the formation of a dicentric chromosome, occurred at the centromeric region of the short arm of a mouse chromosome. By G-banding, this chromosome was identified as mouse chromosome 7. Because of the presence of two functionally active centromeres on the same chromosome, regular breakages occur between the centromeres. Such specific chromosome breakages gave rise to the appearance [in approximately 10% of the cells] of a chromosome fragment carrying the neo-centromere. From the EC3/7 cell line [see, U.S. Pat. No. 5,288,625, deposited at the European Collection of Animal Cell Culture (hereinafter ECACC) under accession no. 90051001; see, also Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110, and U.S. application Ser. No. 08/375,271 and the corresponding published European application EP 0 473 253], two sublines [EC3/7C5 and EC3/7C6] were selected by repeated single-cell cloning. In these cell lines, the neo-centromere was found exclusively on a minichromosome [neo-minichromosome], while the formerly dicentric chromosome carried traces of “heterologous” DNA.

[0101] It has now been discovered that integration of DNA encoding a selectable marker in the heterochromatic region of the centromere led to formation of the dicentric chromosome.

[0102] 3. The Neo-minichromosome

[0103] The chromosome breakage in the EC3/7 cells, which separates the neo-centromere from the mouse chromosome, occurred in the G-band positive “heterologous” DNA region. This is supported by the observation of traces of λ and human DNA sequences at the broken end of the formerly dicentric chromosome. Comparing the G-band pattern of the chromosome fragment carrying the neo-centromere with that of the stable neo-minichromosome, it is apparent that the neo-minichromosome is an inverted duplicate of the chromosome fragment that bears the neo-centromere. This is supported by the observation that although the neo-minichromosome carries only one functional centromere, both ends of the minichromosome are heterochromatic, and mouse satellite DNA sequences were found in these heterochromatic regions by in situ hybridization.

[0104] Mouse cells containing the minichromosome, which contains multiple repeats of the heterologous DNA, which in the exemplified embodiment is λ DNA and the neomycin-resistance gene, can be used as recipient cells in cell transformation. Donor DNA, such as selected heterologous DNA containing λ DNA linked to a second selectable marker, such as the gene encoding hygromycin phosphotransferase which confers hygromycin resistance [hyg], can be introduced into the mouse cells and integrated into the minichromosomes by homologous recombination of λ DNA in the donor DNA with that in the minichromosomes. Integration is verified by in situ hybridization and Southern blot analyses. Transcription and translation of the heterologous DNA is confirmed by primer extension and immunoblot analyses.

[0105] For example, DNA has been targeted into the neo-minichromosome in EC3/7C5 cells using a λ DNA-containing construct [pNem1ruc] that also contains DNA encoding hygromycin resistance and the Renilla luciferase gene linked to a promoter, such as the cytomegalovirus [CMV] early promoter, and the bacterial neomycin resistance-encoding DNA. Integration of the donor DNA into the chromosome in selected cells [designated PHN4] was confirmed by nucleic acid amplification [PCR] and in situ hybridization. Events that would produce a neo-minichromosome are depicted in FIG. 1.

[0106] The resulting engineered minichromosome that contains the heterologous DNA can then be transferred by cell fusion into a recipient cell line, such as Chinese hamster ovary cells [CHO] and correct expression of the heterologous DNA can be verified. Following production of the cells, metaphase chromosomes are obtained, such as by addition of colchicine, and the chromosomes purified by addition of AT- and GC-specific dyes on a dual laser beam based cell sorter (see Example 10 B for a description of methods of isolating artificial chromosomes). Preparative amounts of chromosomes [5×10⁴-5×10⁷ chromosomes/ml] at a purity of 95% or higher can be obtained. The resulting chromosomes are used for delivery to cells by methods such as microinjection and liposome-mediated transfer.

[0107] Thus, the neo-minichromosome is stably maintained in cells, replicates autonomously, and permits the persistent long-term expression of the neo gene under non-selective culture conditions. It also contains megabases of heterologous known DNA [λ DNA in the exemplified embodiments] that serves as target sites for homologous recombination and integration of DNA of interest. The neo-minichromosome is, thus, a vector for genetic engineering of cells. It has been introduced into SCID mice, and shown to replicate in the same manner as endogenous chromosomes.

[0108] The methods herein provide means to induce the events that lead to formation of the neo-minichromosome by introducing heterologous DNA with a selective marker [preferably a dominant selectable marker] into cells and culturing the cells under selective conditions. As a result, cells that contain a multicentric, e.g., dicentric chromosome, or fragments thereof, generated by amplification are produced. Cells with the dicentric chromosome can then be treated to destabilize the chromosomes with agents, such as BrdU and/or culturing under selective conditions, resulting in cells in which the dicentric chromosome has formed two chromosomes, a so-called minichromosome, and a formerly dicentric chromosome that has typically undergone amplification in the heterochromatin where the heterologous DNA has integrated to produce a SATAC or a sausage chromosome [discussed below]. These cells can be fused with other cells to separate the minichromosome from the formerly dicentric chromosome into different cells so that each type of MAC can be manipulated separately.

[0109] 4. Preparation of SATACs

[0110] An exemplary protocol for preparation of SATACs is illustrated in FIG. 2 [particularly D, E and F] and FIG. 3 [see, also the EXAMPLES, particularly EXAMPLES 4-7].

[0111] To prepare a SATAC, the starting materials are cells, preferably a stable cell line, such as a fibroblast cell line, and a DNA fragment that includes DNA that encodes a selective marker. The DNA fragment is introduced into the cell by methods of DNA transfer, including but not limited to direct uptake using calcium phosphate, electroporation, and lipid-mediated transfer. To ensure integration of the DNA fragment in the heterochromatin, it is preferable to start with DNA that will be targeted to the pericentric heterochromatic region of the chromosome, such as λCM8 and vectors provided herein, such as pTEMPUD [FIG. 5] and pHASPUD (see Example 12) that include satellite DNA, or specifically into rDNA in the centromeric regions of chromosomes containing rDNA sequences. After introduction of the DNA, the cells are grown under selective conditions. The resulting cells are examined and any that have multicentric, particularly dicentric, chromosomes [or heterochromatic chromosomes or sausage chromosomes or other such structure; see, FIG. 2D, 2E and 2F] are selected.

[0112] In particular, if a cell with a dicentric chromosome is selected, it can be grown under selective conditions, or, preferably, additional DNA encoding a second selectable marker is introduced, and the cells grown under conditions selective for the second marker. The resulting cells should include chromosomes that have structures similar to those depicted in FIGS. 2D, 2E, 2F. Cells with a structure, such as the sausage chromosome, FIG. 2D, can be selected and fused with a second cell line to eliminate other chromosomes that are not of interest. If desired, cells with other chromosomes can be selected and treated as described herein. If a cell with a sausage chromosome is selected, it can be treated with an agent, such as BrdU, that destabilizes the chromosome so that the heterochromatic arm forms a chromosome that is substantially heterochromatic [i.e., a megachromosome, see, FIG. 2F]. Structures such as the gigachromosome, in which the heterochromatic arm has amplified but not broken off from the euchromatic arm, will also be observed. The megachromosome is a stable chromosome. Further manipulation, such as fusions and growth in selective conditions and/or BrdU treatment or other such treatment, can lead to fragmentation of the megachromosome to form smaller chromosomes that have the amplicon as the basic repeating unit.

[0113] The megachromosome can be further fragmented in vivo using a chromosome fragmentation vector, such as pTEMPUD [see, FIG. 5 and EXAMPLE 12], pHASPUD or pTERPUD (see Example 12) to ultimately produce a chromosome that comprises a smaller stable replicable unit, about 15 Mb-60 Mb, containing one to four megareplicons.

[0114] Thus, the stable chromosomes formed de novo that originate from the short arm of mouse chromosome 7 have been analyzed. This chromosome region shows a capacity for amplification of large chromosome segments, and promotes de novo chromosome formation. Large-scale amplification at the same chromosome region leads to the formation of dicentric and multicentric chromosomes, a minichromosome, the 150-200 Mb size λ neo-chromosome, the “sausage” chromosome, the 500-1000 Mb gigachromosome, and the stable 250-400 Mb megachromosome.

[0115] A clear segmentation is observed along the arms of the megachromosome, and analyses show that the building units of this chromosome are amplicons of ~30 Mb composed of mouse major satellite DNA with the integrated “foreign” DNA sequences at both ends. The ~30 Mb amplicons are composed of two ~15 Mb inverted doublets of ~7.5 Mb mouse major satellite DNA blocks, which are separated from each other by a narrow band of non-satellite sequences [see, e.g., FIG. 3]. The wider non-satellite regions at the amplicon borders contain integrated, exogenous [heterologous] DNA, while the narrow bands of non-satellite DNA sequences within the amplicons are integral parts of the pericentric heterochromatin of mouse chromosomes. These results indicate that the ~7.5 Mb blocks flanked by non-satellite DNA are the building units of the pericentric heterochromatin of mouse chromosomes, and the ~15 Mb size pericentric regions of mouse chromosomes contain two ~7.5 Mb units.
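
The size relationships described above can be checked with a short calculation. The following Python sketch is illustrative only: it uses the approximate sizes quoted in the text, ignores the non-satellite bands and the euchromatic terminal segments, and the function name `amplicons_in` is a hypothetical helper introduced here for illustration.

```python
# Illustrative arithmetic only; all sizes are the approximate values quoted in the text.
BLOCK_MB = 7.5                  # one mouse major satellite DNA block (~7.5 Mb)
DOUBLET_MB = 2 * BLOCK_MB       # inverted doublet of two blocks (~15 Mb)
AMPLICON_MB = 2 * DOUBLET_MB    # amplicon made of two doublets (~30 Mb)


def amplicons_in(chromosome_mb: float) -> float:
    """Approximate number of ~30 Mb amplicons spanning a chromosome of the given size.

    Hypothetical helper: assumes the whole chromosome is built from amplicons.
    """
    return chromosome_mb / AMPLICON_MB


if __name__ == "__main__":
    print("doublet (Mb):", DOUBLET_MB)
    print("amplicon (Mb):", AMPLICON_MB)
    # Size range stated for the megachromosome: 250-400 Mb
    for size_mb in (250, 400):
        print(size_mb, "Mb ->", round(amplicons_in(size_mb), 1), "amplicons")
```

Under these simplifying assumptions, a 250-400 Mb megachromosome would correspond to roughly 8 to 13 amplicons; the figure is an estimate derived from the quoted sizes, not a measured value.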

[0116] Apart from the euchromatic terminal segments, the whole megachromosome is heterochromatic, and has structural homogeneity. Therefore, this large chromosome offers a unique possibility for obtaining information about the amplification process, and for analyzing some basic characteristics of the pericentric constitutive heterochromatin, as a vector for heterologous DNA, and as a target for further fragmentation.

[0117] As shown herein, this phenomenon is generalizable and can be observed with other chromosomes. Also, although these de novo formed chromosome segments and chromosomes appear different, there are similarities that indicate that a similar amplification mechanism plays a role in their formation: (i) in each case, the amplification is initiated in the centromeric region of the mouse chromosomes and large (Mb size) amplicons are formed; (ii) mouse major satellite DNA sequences are constant constituents of the amplicons, either by providing the bulk of the heterochromatic amplicons [H-type amplification], or by bordering the euchromatic amplicons [E-type amplification]; (iii) formation of inverted segments can be demonstrated in the λ neo-chromosome and megachromosome; (iv) chromosome arms and chromosomes formed by the amplification are stable and functional.

[0118] The presence of inverted chromosome segments seems to be a common phenomenon in the chromosomes formed de novo at the centromeric region of mouse chromosome 7. During the formation of the neo-minichromosome, the event leading to the stabilization of the distal segment of mouse chromosome 7 that bears the neo-centromere may have been the formation of its inverted duplicate. Amplicons of the megachromosome are inverted doublets of ~7.5 Mb mouse major satellite DNA blocks.

[0119] 5. Cell Lines

[0120] Cell lines that contain MACs, such as the minichromosome, the λ-neo chromosome, and the SATACs are provided herein or can be produced by the methods herein. Such cell lines provide a convenient source of these chromosomes and can be manipulated, such as by cell fusion or production of microcells for fusion with selected cell lines, to deliver the chromosome of interest into hybrid cell lines. Exemplary cell lines are described herein and some have been deposited with the ECACC.

[0121] a. EC3/7C5 and EC3/7C6

[0122] Cell lines EC3/7C5 and EC3/7C6 were produced by single cell cloning of EC3/7. For exemplary purposes EC3/7C5 has been deposited with the ECACC. These cell lines contain a minichromosome and the formerly dicentric chromosome from EC3/7. The stable minichromosomes in cell lines EC3/7C5 and EC3/7C6 appear to be the same and they seem to be duplicated derivatives of the ~10-15 Mb “broken-off” fragment of the dicentric chromosome. Their similar size in these independently generated cell lines might indicate that ~20-30 Mb is the minimal or close to the minimal physical size for a stable minichromosome.

[0123] b. TF1004G19

[0124] Introduction of additional heterologous DNA, including DNA encoding a second selectable marker, hygromycin phosphotransferase, i.e., the hygromycin-resistance gene, and also a detectable marker, β-galactosidase (i.e., encoded by the lacZ gene), into the EC3/7C5 cell line and growth under selective conditions produced cells designated TF1004G19. In particular, this cell line was produced from the EC3/7C5 cell line by cotransfection with plasmids pH132, which contains an anti-HIV ribozyme and hygromycin-resistance gene, pCH110 [encodes β-galactosidase] and λ phage [λcl 875 Sam 7] DNA and selection with hygromycin B.

[0125] Detailed analysis of the TF1004G19 cell line by in situ hybridization with λ phage and plasmid DNA sequences revealed the formation of the sausage chromosome. The formerly dicentric chromosome of the EC3/7C5 cell line translocated to the end of another acrocentric chromosome. The heterologous DNA integrated into the pericentric heterochromatin of the formerly dicentric chromosome and is amplified several times with megabases of mouse pericentric heterochromatic satellite DNA sequences [FIG. 2D] forming the “sausage” chromosome. Subsequently the acrocentric mouse chromosome was substituted by a euchromatic telomere.

[0126] In situ hybridization with biotin-labeled subfragments of the hygromycin-resistance and β-galactosidase genes resulted in a hybridization signal only in the heterochromatic arm of the sausage chromosome, indicating that in TF1004G19 transformant cells these genes are localized in the pericentric heterochromatin.

[0127] A high level of gene expression, however, was detected. In general, heterochromatin has a silencing effect in Drosophila, yeast and on the HSV-tk gene introduced into satellite DNA at the mouse centromere. Thus, it was of interest to study the TF1004G19 transformed cell line to confirm that genes located in the heterochromatin were indeed expressed, contrary to recognized dogma.

[0128] For this purpose, subclones of TF1004G19, containing a different sausage chromosome [see FIG. 2D], were established by single cell cloning. Southern hybridization of DNA isolated from the subclones with subfragments of hygromycin phosphotransferase and lacZ genes showed a close correlation between the intensity of hybridization and the length of the sausage chromosome. This finding supports the conclusion that these genes are localized in the heterochromatic arm of the sausage chromosome.

[0129] (1) TF1004G-19C5

[0130] TF1004G-19C5 is a mouse LMTK⁻ fibroblast cell line containing neo-minichromosomes and stable “sausage” chromosomes. It is a subclone of TF1004G19 and was generated by single-cell cloning of the TF1004G19 cell line. It has been deposited with the ECACC as an exemplary cell line and exemplary source of a sausage chromosome. Subsequent fusion of this cell line with CHO K20 cells and selection with hygromycin and G418 and HAT (hypoxanthine, aminopterin, and thymidine medium; see Szybalski et al. (1962) Proc. Natl. Acad. Sci. 48:2026) resulted in hybrid cells (designated 19C5xHa4) that carry the sausage chromosome and the neo-minichromosome. BrdU treatment of the hybrid cells, followed by single cell cloning and selection with G418 and/or hygromycin produced various cells that carry chromosomes of interest, including GB43 and G3D5.

[0131] (2) Other Subclones

[0132] Cell lines GB43 and G3D5 were obtained by treating 19C5xHa4 cells with BrdU followed by growth in G418-containing selective medium and retreatment with BrdU. The two cell lines were isolated by single cell cloning of the selected cells. GB43 cells contain the neo-minichromosome only. G3D5, which has been deposited with the ECACC, carries the neo-minichromosome and the megachromosome. Single cell cloning of this cell line followed by growth of the subclones in G418- and hygromycin-containing medium yielded subclones such as the GHB42 cell line carrying the neo-minichromosome and the megachromosome. H1D3 is a mouse-hamster hybrid cell line carrying the megachromosome, but no neo-minichromosome, and was generated by treating 19C5xHa4 cells with BrdU followed by growth in hygromycin-containing selective medium and single cell subcloning of selected cells. Fusion of this cell line with the CD4⁺ HeLa cell line that also carries DNA encoding an additional selection gene, the neomycin-resistance gene, produced cells [designated H1xHE41 cells] that carry the megachromosome as well as a human chromosome that carries CD4neo. Further BrdU treatment and single cell cloning produced cell lines, such as 1B3, that include cells with a truncated megachromosome.

[0133] 5. DNA Constructs used to Transform the Cells

[0134] Heterologous DNA can be introduced into the cells by transfection or other suitable method at any stage during preparation of the chromosomes [see, e.g., FIG. 4]. In general, incorporation of such DNA into the MACs is assured through site-directed integration, such as may be accomplished by inclusion of λ-DNA in the heterologous DNA (for the exemplified chromosomes), and also an additional selective marker gene. For example, cells containing a MAC, such as the minichromosome or a SATAC, can be cotransfected with a plasmid carrying the desired heterologous DNA, such as DNA encoding an HIV ribozyme, the cystic fibrosis gene, and DNA encoding a second selectable marker, such as hygromycin resistance. Selective pressure is then applied to the cells by exposing them to an agent that is harmful to cells that do not express the new selectable marker. In this manner, cells that include the heterologous DNA in the MAC are identified. Fusion with a second cell line can provide a means to produce cell lines that contain one particular type of chromosomal structure or MAC.

[0135] Various vectors for this purpose are provided herein [see, Examples] and others can be readily constructed. The vectors preferably include DNA that is homologous to DNA contained within a MAC in order to target the DNA to the MAC for integration therein. The vectors also include a selectable marker gene and the selected heterologous gene(s) of interest. Based on the disclosure herein and the knowledge of the skilled artisan, one of skill can construct such vectors.

[0136] Of particular interest herein is the vector pTEMPUD and derivatives thereof that can target DNA into the heterochromatic region of selected chromosomes. These vectors can also serve as fragmentation vectors [see, e.g., Example 12].

[0137] Heterologous genes of interest include any gene that encodes a therapeutic product and DNA encoding gene products of interest. These genes and DNA include, but are not limited to: the cystic fibrosis gene [CF], the cystic fibrosis transmembrane regulator (CFTR) gene [see, e.g., U.S. Pat. No. 5,240,846; Rosenfeld et al. (1992) Cell 68:143-155; Hyde et al. (1993) Nature 362:250-255; Kerem et al. (1989) Science 245:1073-1080; Riordan et al. (1989) Science 245:1066-1072; Rommens et al. (1989) Science 245:1059-1065; Osborne et al. (1991) Am. J. Hum. Genetics 48:6089-6122; White et al. (1990) Nature 344:665-667; Dean et al. (1990) Cell 61:863-870; Erlich et al. (1991) Science 252:1643; and U.S. Pat. Nos. 5,453,357, 5,449,604, 5,434,086, and 5,240,846, which provides a retroviral vector encoding the normal CFTR gene].

[0138] B. Isolation of Artificial Chromosomes

[0139] The MACs provided herein can be isolated by any suitable method known to those of skill in the art. Also, methods are provided herein for effecting substantial purification, particularly of the SATACs. SATACs have been isolated by fluorescence-activated cell sorting [FACS]. This method takes advantage of the nucleotide base content of the SATACs, which, by virtue of their high heterochromatic DNA content, will differ from any other chromosomes in a cell. In a particular embodiment, metaphase chromosomes are isolated and stained with base-specific dyes, such as Hoechst 33258 and chromomycin A3. Fluorescence-activated cell sorting will separate the SATACs from the endogenous chromosomes. A dual-laser cell sorter [FACS Vantage, Becton Dickinson Immunocytometry Systems], in which two lasers were set to excite the dyes separately, allowed a bivariate analysis of the chromosomes by base-pair composition and size. Cells containing such SATACs can be similarly sorted.

[0140] Additional methods provided herein for isolation of artificial chromosomes from endogenous chromosomes include procedures that are particularly well suited for large-scale isolation of artificial chromosomes such as SATACs. In these methods, the size and density differences between SATACs and endogenous chromosomes are exploited to effect separation of these two types of chromosomes. Such methods involve techniques such as swinging bucket centrifugation, zonal rotor centrifugation, and velocity sedimentation. Affinity-, particularly immunoaffinity-, based methods for separation of artificial from endogenous chromosomes are also provided herein. For example, SATACs, which are predominantly heterochromatin, may be separated from endogenous chromosomes through immunoaffinity procedures involving antibodies that specifically recognize heterochromatin, and/or the proteins associated therewith, when the endogenous chromosomes contain relatively little heterochromatin, such as in hamster cells.

[0141] C. In Vitro Construction of Artificial Chromosomes

[0142] Artificial chromosomes can be constructed in vitro by assembling the structural and functional elements that contribute to a complete chromosome capable of stable replication and segregation alongside endogenous chromosomes in cells. The identification of the discrete elements that in combination yield a functional chromosome has made possible the in vitro generation of artificial chromosomes. The process of in vitro construction of artificial chromosomes, which can be rigidly controlled, provides advantages that may be desired in the generation of chromosomes that, for example, are required in large amounts or that are intended for specific use in transgenic animal systems.

[0143] For example, in vitro construction may be advantageous when efficiency of time and scale are important considerations in the preparation of artificial chromosomes. Because in vitro construction methods do not involve extensive cell culture procedures, they may be utilized when the time and labor required to transform, feed, cultivate, and harvest cells used in in vivo cell-based production systems are unavailable.

[0144] In vitro construction may also be rigorously controlled with respect to the exact manner in which the several elements of the desired artificial chromosome are combined and in what sequence and proportions they are assembled to yield a chromosome of precise specifications. These aspects may be of significance in the production of artificial chromosomes that will be used in live animals where it is desirable to be certain that only very pure and specific DNA sequences in specific amounts are being introduced into the host animal.

[0145] The following describes the processes involved in the construction of artificial chromosomes in vitro, utilizing a megachromosome as exemplary starting material.

[0146] 1. Identification and Isolation of the Components of the Artificial Chromosome

[0147] The MACs provided herein, particularly the SATACs, are elegantly simple chromosomes for use in the identification and isolation of components to be used in the in vitro construction of artificial chromosomes. The ability to purify MACs to a very high level of purity, as described herein, facilitates their use for these purposes. For example, the megachromosome, particularly truncated forms thereof [i.e., cell lines, such as 1B3 and mM2C1, which are derived from H1D3 (deposited at the European Collection of Animal Cell Culture (ECACC) under Accession No. 96040929, see EXAMPLES below)], serve as starting materials.

[0148] For example, the mM2C1 cell line contains a micro-megachromosome (~50-60 Mb), which advantageously contains only one centromere, two regions of integrated heterologous DNA with adjacent rDNA sequences, with the remainder of the chromosomal DNA being mouse major satellite DNA. Other truncated megachromosomes can serve as a source of telomeres, or telomeres can be provided (see, Examples below regarding construction of plasmids containing tandemly repeated telomeric sequences). The centromere of the mM2C1 cell line contains mouse minor satellite DNA, which provides a useful tag for isolation of the centromeric DNA.

[0149] Additional features of particular SATACs provided herein, such as the micro-megachromosome of the mM2C1 cell line, that make them uniquely suited to serve as starting materials in the isolation and identification of chromosomal components include the fact that the centromeres of each megachromosome within a single specific cell line are identical. The ability to begin with a homogeneous centromere source (as opposed to a mixture of different chromosomes having differing centromeric sequences) greatly facilitates the cloning of the centromere DNA. By digesting purified megachromosomes, particularly truncated megachromosomes, such as the micro-megachromosome, with appropriate restriction endonucleases and cloning the fragments into the commercially available and well known YAC vectors (see, e.g., Burke et al. (1987) Science 236:806-812), BAC vectors (bacterial artificial chromosomes, which have a capacity of incorporating 0.9-1 Mb of DNA; see, e.g., Shizuya et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:8794-8797) or PAC vectors (the P1 artificial chromosome vector, which is a P1 plasmid derivative that has a capacity of incorporating 300 kb of DNA and that is delivered to E. coli host cells by electroporation rather than by bacteriophage packaging; see, e.g., Ioannou et al. (1994) Nature Genetics 6:84-89; Pierce et al. (1992) Meth. Enzymol. 216:549-574; Pierce et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:2056-2060; U.S. Pat. No. 5,300,431 and International PCT application No. WO 92/14819), it is possible for as few as 50 clones to represent the entire micro-megachromosome.
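
The clone-count estimate above follows from dividing the chromosome size by the insert capacity of the vector. The Python sketch below is a minimal illustration of that arithmetic; the function name `min_clones` is hypothetical, and the 50 Mb chromosome size used in the example is an assumption chosen so that the result matches the 50-clone figure with ~1 Mb inserts. The result is a lower bound: it assumes non-overlapping inserts of maximal size, whereas a practical library would need additional clones.

```python
import math


def min_clones(chromosome_kb: float, insert_kb: float) -> int:
    """Smallest number of non-overlapping inserts that could span a chromosome.

    Hypothetical helper for illustration; sizes are given in kilobases.
    """
    if chromosome_kb <= 0 or insert_kb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(chromosome_kb / insert_kb)


if __name__ == "__main__":
    chromosome_kb = 50_000  # assumed ~50 Mb chromosome (illustrative value)
    # Insert capacities quoted in the text: ~1 Mb and 300 kb
    print("~1 Mb inserts:", min_clones(chromosome_kb, 1_000))
    print("300 kb inserts:", min_clones(chromosome_kb, 300))
```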

[0150] a. Centromeres

[0151] An exemplary centromere for use in the construction of a mammalian artificial chromosome is that contained within the megachromosome of any of the megachromosome-containing cell lines provided herein, such as, for example, H1D3 and derivatives thereof, such as mM2C1 cells. Megachromosomes are isolated from such cell lines utilizing, for example, the procedures described herein, and the centromeric sequence is extracted from the isolated megachromosomes. For example, the megachromosomes may be separated into fragments utilizing selected restriction endonucleases that recognize and cut at sites that, for instance, are primarily located in the replication and/or heterologous DNA integration sites and/or in the satellite DNA. Based on the sizes of the resulting fragments, certain undesired elements may be separated from the centromere-containing sequences. The centromere-containing DNA could be as large as 1 Mb.

[0152] Probes that specifically recognize the centromeric sequences, such as mouse minor satellite DNA-based probes [see, e.g., Wong et al. (1988) Nucl. Acids Res. 16:11645-11661], may be used to isolate the centromere-containing YAC, BAC or PAC clones derived from the megachromosome. Alternatively, or in conjunction with the direct identification of centromere-containing megachromosomal DNA, probes that specifically recognize the non-centromeric elements, such as probes specific for mouse major satellite DNA, the heterologous DNA and/or rDNA, may be used to identify and eliminate the non-centromeric DNA-containing clones.

[0153] Additionally, centromere cloning methods described herein may be utilized to isolate the centromere-containing sequence of the megachromosome. For example, Example 12 describes the use of YAC vectors in combination with the murine tyrosinase gene and NMRI/Han mice for identification of the centromeric sequence.

[0154] Once the centromere fragment has been isolated, it may be sequenced and the sequence information may in turn be used in PCR amplification of centromere sequences from megachromosomes or other sources of centromeres. Isolated centromeres may also be tested for function in vivo by transferring the DNA into a host mammalian cell. Functional analysis may include, for example, examining the ability of the centromere sequence to bind centromere-binding proteins. The cloned centromere will be transferred to mammalian cells with a selectable marker gene and the binding of a centromere-specific protein, such as anti-centromere antibodies (e.g., LU851, see, Hadlaczky et al. (1986) Exp. Cell Res. 167:1-15) can be used to assess function of the centromeres.

[0155] b. Telomeres

[0156] Preferred telomeres are the 1 kB synthetic telomere provided herein (see, Examples). A double synthetic telomere construct, which contains a 1 kB synthetic telomere linked to a dominant selectable marker gene that continues in an inverted orientation may be used for ease of manipulation. Such a double construct contains a series of TTAGGG repeats 3′ of the marker gene and a series of repeats of the inverted sequence, i.e., GGGATT, 5′ of the marker gene as follows:

[0157] (GGGATT)_(n)-dominant marker gene-(TTAGGG)_(n). Using an inverted marker provides an easy means for insertion, such as by blunt end ligation, since only properly oriented fragments will be selected.
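
The layout of the double construct can be written out as a string for illustration. The Python sketch below follows the notation used in the text, in which the 5′ repeat GGGATT is the written reversal of TTAGGG; it makes no claim about strand or complementarity. The function names and the placeholder marker string are hypothetical, and the repeat count for a 1 kB telomere is simple division by the 6-base unit length.

```python
TELOMERE_UNIT = "TTAGGG"  # repeat unit named in the text


def repeats_for_length(target_bp: int, unit: str = TELOMERE_UNIT) -> int:
    """Number of whole repeat units that fit within target_bp base pairs."""
    return target_bp // len(unit)


def double_telomere_construct(marker: str, n: int, unit: str = TELOMERE_UNIT) -> str:
    """Write out (inverted unit)_n + marker + (unit)_n as a plain string.

    'Inverted' here means the written reversal of the unit (TTAGGG -> GGGATT),
    matching the notation of the text. The marker is a placeholder label.
    """
    inverted_unit = unit[::-1]
    return inverted_unit * n + marker + unit * n


if __name__ == "__main__":
    # Small n so the layout is easy to read; the marker is a placeholder label
    print(double_telomere_construct("-[marker]-", 3))
    # Whole 6-base repeats in a 1 kB telomere
    print(repeats_for_length(1000))
```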

[0158] c. Megareplicator

[0159] The megareplicator sequences, such as the rDNA, provided herein are preferred for use in in vitro constructs. The rDNA provides an origin of replication and also provides sequences that facilitate amplification of the artificial chromosome in vivo to increase the size of the chromosome to, for example, accommodate increasing copies of a heterologous gene of interest as well as continuous high levels of expression of the heterologous genes.

[0160] d. Filler Heterochromatin

[0161] Filler heterochromatin, particularly satellite DNA, is included to maintain structural integrity and stability of the artificial chromosome and provide a structural base for carrying genes within the chromosome. The satellite DNA is typically A/T-rich DNA sequence, such as mouse major satellite DNA, or G/C-rich DNA sequence, such as hamster natural satellite DNA. Sources of such DNA include any eukaryotic organisms that carry non-coding satellite DNA with sufficient A/T or G/C composition to promote ready separation by sequence, such as by FACS, or by density gradients. The satellite DNA may also be synthesized by generating sequence containing monotone, tandem repeats of highly A/T- or G/C-rich DNA units.
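
Base composition of a candidate filler sequence can be estimated directly from its sequence. The Python sketch below is a minimal illustration: the function names are hypothetical, and the repeat units used in the example are arbitrary strings chosen to show the calculation, not sequences taken from mouse or hamster satellite DNA.

```python
def tandem_repeat(unit: str, copies: int) -> str:
    """Build a monotone tandem repeat by concatenating copies of one unit."""
    return unit * copies


def at_fraction(seq: str) -> float:
    """Fraction of A and T among the A/C/G/T bases of a sequence.

    Characters other than A, C, G and T are ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return sum(b in "AT" for b in bases) / len(bases)


if __name__ == "__main__":
    # Arbitrary example units (assumptions for illustration only)
    at_rich = tandem_repeat("AATAT", 10)
    gc_rich = tandem_repeat("GGCGC", 10)
    print("A/T fraction, A/T-rich repeat:", at_fraction(at_rich))
    print("A/T fraction, G/C-rich repeat:", at_fraction(gc_rich))
```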

[0162] The most suitable amount of filler heterochromatin for use in construction of the artificial chromosome may be empirically determined by, for example, including segments of various lengths, increasing in size, in the construction process. Fragments that are too small to be suitable for use will not provide for a functional chromosome, which may be evaluated in cell-based expression studies, or will result in a chromosome of limited functional lifetime or mitotic and structural stability.

[0163] e. Selectable Marker

[0164] Any convenient selectable marker may be used and at any convenient locus in the MAC.

[0165] 2. Combination of the Isolated Chromosomal Elements

[0166] Once the isolated elements are obtained, they may be combined to generate the complete, functional artificial chromosome. This assembly can be accomplished, for example, by in vitro ligation in solution, in LMP agarose or on microbeads. The ligation is conducted so that one end of the centromere is directly joined to a telomere. The other end of the centromere, which serves as the gene-carrying chromosome arm, is built up from a combination of satellite DNA and rDNA sequence and may also contain a selectable marker gene. Another telomere is joined to the end of the gene-carrying chromosome arm. The gene-carrying arm is the site at which any heterologous genes of interest, for example, in expression of desired proteins encoded thereby, are incorporated either during in vitro construction of the chromosome or sometime thereafter.

[0167] 3. Analysis and Testing of the Artificial Chromosome

[0168] Artificial chromosomes constructed in vitro may be tested for functionality in in vivo mammalian cell systems, using any of the methods described herein for the SATACs, minichromosomes, or known to those of skill in the art.

[0169] 4. Introduction of Desired Heterologous DNA into the in Vitro Synthesized Chromosome

[0170] Heterologous DNA may be introduced into the in vitro synthesized chromosome using routine methods of molecular biology, may be introduced using the methods described herein for the SATACs, or may be incorporated into the in vitro synthesized chromosome as part of one of the synthetic elements, such as the heterochromatin. The heterologous DNA may be linked to a selected repeated fragment, and then the resulting construct may be amplified in vitro using the methods for such in vitro amplification provided herein (see the Examples).

[0171] D. Introduction of Artificial Chromosomes Into Cells, Tissues, Animals and Plants

[0172] Suitable hosts for introduction of the MACs provided herein include, but are not limited to, any animal or plant, cell or tissue thereof, including, but not limited to: mammals, birds, reptiles, amphibians, insects, fish, arachnids, tobacco, tomato, wheat, plants and algae. The MACs, if contained in cells, may be introduced by cell fusion or microcell fusion or, if the MACs have been isolated from cells, they may be introduced into host cells by any method known to those of skill in this art, including but not limited to: direct DNA transfer, electroporation, lipid-mediated transfer, e.g., lipofection and liposomes, microprojectile bombardment, microinjection in cells and embryos, protoplast regeneration for plants, and any other suitable method [see, e.g., Weissbach et al. (1988) Methods for Plant Molecular Biology, Academic Press, N.Y., Section VIII, pp. 421-463; Grierson et al. (1988) Plant Molecular Biology, 2d Ed., Blackie, London, Ch. 7-9; see, also U.S. Pat. Nos. 5,491,075; 5,482,928; and 5,424,409; see, also, e.g., U.S. Pat. No. 5,470,708, which describes particle-mediated transformation of mammalian unattached cells].

[0173] Other methods for introducing DNA into cells include nuclear microinjection and bacterial protoplast fusion with intact cells. Polycations, such as polybrene and polyornithine, may also be used. For various techniques for transforming mammalian cells, see, e.g., Keown et al. Methods in Enzymology (1990) Vol. 185, pp. 527-537; and Mansour et al. (1988) Nature 336:348-352.

[0174] For example, isolated, purified artificial chromosomes can be injected into an embryonic cell line such as a human kidney primary embryonic cell line [ATCC accession number CRL 1573] or embryonic stem cells [see, e.g., Hogan et al. (1994) Manipulating the Mouse Embryo, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., see, especially, pages 255-264 and Appendix 3].

[0175] Preferably the chromosomes are introduced by microinjection, using a system such as the Eppendorf automated microinjection system, and grown under selective conditions, such as in the presence of hygromycin B or neomycin.

[0176] 1. Methods for Introduction of Chromosomes into Hosts

[0177] Depending on the host cell used, transformation is done using standard techniques appropriate to such cells. These methods include any known to those of skill in the art, including those described herein.

[0178] a. DNA Uptake

[0179] For mammalian cells that do not have cell walls, the calcium phosphate precipitation method for introduction of exogenous DNA [see, e.g., Graham et al. (1978) Virology 52:456-457; Wigler et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1373-1376; and Current Protocols in Molecular Biology, Vol. 1, Wiley Inter-Science, Supplement 14, Unit 9.1.1-9.1.9 (1990)] is often preferred. DNA uptake can be accomplished by DNA alone or in the presence of polyethylene glycol [PEG-mediated gene transfer], which is a fusion agent, or by any variations of such methods known to those of skill in the art [see, e.g., U.S. Pat. No. 4,684,611].

[0180] Lipid-mediated carrier systems are also among the preferred methods for introduction of DNA into cells [see, e.g., Teifel et al. (1995) Biotechniques 19:79-80; Albrecht et al. (1996) Ann. Hematol. 72:73-79; Holmen et al. (1995) In Vitro Cell Dev. Biol. Anim. 31:347-351; Remy et al. (1994) Bioconjug. Chem. 5:647-654; Le Bolc'h et al. (1995) Tetrahedron Lett. 36:6681-6684; Loeffler et al. (1993) Meth. Enzymol. 217:599-618]. Lipofection [see, e.g., Strauss (1996) Meth. Mol. Biol. 54:307-327] may also be used to introduce DNA into cells. This method is particularly well-suited for transfer of exogenous DNA into chicken cells (e.g., chicken blastodermal cells and primary chicken fibroblasts; see Brazolot et al. (1991) Mol. Repro. Dev. 30:304-312). In particular, DNA of interest can be introduced into chickens in operative linkage with promoters from genes, such as lysozyme and ovalbumin, that are expressed in the egg, thereby permitting expression of the heterologous DNA in the egg.

[0181] Additional methods useful in the direct transfer of DNA into cells include particle gun electrofusion [see, e.g., U.S. Pat. Nos. 4,955,378, 4,923,814, 4,476,004, 4,906,576 and 4,441,972] and virion-mediated gene transfer.

[0182] A commonly used approach for gene transfer in land plants involves the direct introduction of purified DNA into protoplasts. The three basic methods for direct gene transfer into plant cells include: 1) polyethylene glycol [PEG]-mediated DNA uptake, 2) electroporation-mediated DNA uptake and 3) microinjection. In addition, plants may be transformed using ultrasound treatment [see, e.g., International PCT application publication No. WO 91/00358].

[0183] b. Electroporation

[0184] Electroporation involves providing high-voltage electrical pulses to a solution containing a mixture of protoplasts and foreign DNA to create reversible pores in the membranes of plant protoplasts as well as other cells. Electroporation is generally used for prokaryotes or other cells, such as plants, that contain substantial cell-wall barriers. Methods for effecting electroporation are well known [see, e.g., U.S. Pat. Nos. 4,784,737, 5,501,967, 5,501,662, 5,019,034, 5,503,999; see, also Fromm et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828].

[0185] For example, electroporation is often used for transformation of plants [see, e.g., Ag Biotechnology News 7:3 and 17 (September/October 1990)]. In this technique, plant protoplasts are electroporated in the presence of the DNA of interest that also includes a phenotypic marker. Electrical impulses of high field strength reversibly permeabilize biomembranes, allowing the introduction of the plasmids. Electroporated plant protoplasts reform the cell wall, divide, and form plant callus. Transformed plant cells will be identified by virtue of the expressed phenotypic marker. The exogenous DNA may be added to the protoplasts in any form such as, for example, naked linear, circular or supercoiled DNA, DNA encapsulated in liposomes, DNA in spheroplasts, DNA in other plant protoplasts, DNA complexed with salts, and other methods.

[0186] c. Microcells

[0187] The chromosomes can be transferred by preparing microcells containing an artificial chromosome and then fusing with selected target cells. Methods for such preparation and fusion of microcells are well known [see the Examples and also see, e.g., U.S. Pat. Nos. 5,240,840, 4,806,476, 5,298,429, 5,396,767, Fournier (1981) Proc. Natl. Acad. Sci. U.S.A. 78:6349-6353; and Lambert et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:5907-59]. Microcell fusion, using microcells that contain an artificial chromosome, is a particularly useful method for introduction of MACs into avian cells, such as DT40 chicken pre-B cells [for a description of DT40 cell fusion, see, e.g., Dieken et al. (1996) Nature Genet. 12:174-182].

[0188] 2. Hosts

[0189] Suitable hosts include any host known to be useful for introduction and expression of heterologous DNA. Of particular interest herein are animal and plant cells and tissues, including, but not limited to, insect cells and larvae, plants, and animals, particularly transgenic (non-human) animals, and animal cells. Other hosts include, but are not limited to, mammals, birds, particularly fowl such as chickens, reptiles, amphibians, insects, fish, arachnids, tobacco, tomato, wheat, monocots, dicots and algae, and any host into which introduction of heterologous DNA is desired. Such introduction can be effected using the MACs provided herein, or, if necessary, by using the MACs provided herein to identify species-specific centromeres and/or functional chromosomal units and then using the resulting centromeres or chromosomal units as artificial chromosomes, or alternatively, using the methods exemplified herein for production of MACs to produce species-specific artificial chromosomes.

[0190] a. Introduction of DNA into embryos for production of transgenic (non-human) animals and introduction of DNA into animal cells

[0191] Transgenic (non-human) animals can be produced by introducing exogenous genetic material into a pronucleus of a mammalian zygote by microinjection [see, e.g., U.S. Pat. Nos. 4,873,191 and 5,354,674; see, also, International PCT application publication No. WO 95/14769, which is based on U.S. application Ser. No. 08/159,084]. The zygote is capable of development into a mammal. The embryo or zygote is transplanted into a host female uterus and allowed to develop. Detailed protocols and examples are set forth below.

[0192] Nuclear transfer may also be used [see, Wilmut et al. (1997) Nature 385:810-813, International PCT application Nos. WO 97/07669 and WO 97/07668]. Briefly, in this method, the SATAC containing the genes of interest is introduced by any suitable method into an appropriate donor cell, such as a mammary gland cell, that contains totipotent nuclei. The diploid nucleus of the cell, which is either in G0 or G1 phase, is then introduced, such as by cell fusion or microinjection, into an unactivated oöcyte, preferably an enucleated cell, which is arrested in the metaphase of the second meiotic division. Enucleation may be effected by any suitable method, such as actual removal, or by treating with means, such as ultraviolet light, that functionally remove the nucleus. The oöcyte is then activated, preferably after a period of contact, about 6-20 hours for cattle, of the new nucleus with the cytoplasm, while maintaining correct ploidy, to produce a reconstituted embryo, which is then introduced into a host. Ploidy is maintained during activation, for example, by incubating the reconstituted cell in the presence of a microtubule inhibitor, such as nocodazole, colchicine, colcemid, and taxol, whereby the DNA replicates once.

[0193] Transgenic chickens can be produced by injection of dispersed blastodermal cells from Stage X chicken embryos into recipient embryos at a similar stage of development [see, e.g., Etches et al. (1993) Poultry Sci. 72:882-889; Petitte et al. (1990) Development 108:185-189]. Heterologous DNA is first introduced into the donor blastodermal cells using methods such as, for example, lipofection [see, e.g., Brazolot et al. (1991) Mol. Repro. Dev. 30:304-312] or microcell fusion [see, e.g., Dieken et al. (1996) Nature Genet. 12:174-182]. The transfected donor cells are then injected into recipient chicken embryos [see, e.g., Carsience et al. (1993) Development 117:669-675]. The recipient chicken embryos within the shell are candled and allowed to hatch to yield a germline chimeric chicken.

[0194] DNA can be introduced into animal cells using any known procedure, including, but not limited to: direct uptake, incubation with polyethylene glycol [PEG], microinjection, electroporation, lipofection, cell fusion, microcell fusion, particle bombardment, including microprojectile bombardment [see, e.g., U.S. Pat. No. 5,470,708, which provides a method for transforming unattached mammalian cells via particle bombardment], and any other such method. For example, the transfer of plasmid DNA in liposomes directly to human cells in situ has been approved by the FDA for use in humans [see, e.g., Nabel, et al. (1990) Science 249:1285-1288 and U.S. Pat. No. 5,461,032].

[0195] b. Introduction of heterologous DNA into plants

[0196] Numerous methods for producing or developing transgenic plants are available to those of skill in the art. The method used is primarily a function of the species of plant. These methods include, but are not limited to: direct transfer of DNA by processes, such as PEG-induced DNA uptake, protoplast fusion, microinjection, electroporation, and microprojectile bombardment [see, e.g., Uchimiya et al. (1989) J. of Biotech. 12:1-20 for a review of such procedures, see, also, e.g., U.S. Pat. Nos. 5,436,392 and 5,489,520 and many others]. For purposes herein, when introducing a MAC, microinjection, protoplast fusion and particle gun bombardment are preferred.

[0197] Plant species, including tobacco, rice, maize, rye, soybean, Brassica napus, cotton, lettuce, potato and tomato, have been used to produce transgenic plants. Tobacco and other species, such as petunias, often serve as experimental models in which the methods have been developed and the genes first introduced and expressed.

[0198] DNA uptake can be accomplished by DNA alone or in the presence of PEG, which is a fusion agent, with plant protoplasts or by any variations of such methods known to those of skill in the art [see, e.g., U.S. Pat. No. 4,684,611 to Schilperoort et al.]. Electroporation, which involves applying high-voltage electrical pulses to a solution containing a mixture of protoplasts and foreign DNA to create reversible pores, has been used, for example, to successfully introduce foreign genes into rice and Brassica napus. Microinjection of DNA into plant cells, including cultured cells and cells in intact plant organs and embryoids in tissue culture, and microprojectile bombardment [acceleration of small high density particles, which contain the DNA, to high velocity with a particle gun apparatus, which forces the particles to penetrate plant cell walls and membranes] have also been used. All plant cells into which DNA can be introduced and that can be regenerated from the transformed cells can be used to produce transformed whole plants which contain the transferred artificial chromosome. The particular protocol and means for introduction of the DNA into the plant host may need to be adapted or refined to suit the particular plant species or cultivar.

[0199] c. Insect cells

[0200] Insects are useful hosts for introduction of artificial chromosomes for numerous reasons, including, but not limited to: (a) amplification of genes encoding useful proteins can be accomplished in the artificial chromosome to obtain higher protein yields in insect cells; (b) insect cells support required post-translational modifications, such as glycosylation and phosphorylation, that can be required for protein biological functioning; (c) insect cells do not support mammalian viruses, and, thus, eliminate the problem of cross-contamination of products with such infectious agents; (d) this technology circumvents traditional recombinant baculovirus systems for production of nutritional, industrial or medicinal proteins in insect cell systems; (e) the low temperature optimum for insect cell growth (28° C.) permits reduced energy cost of production; (f) serum-free growth medium for insect cells permits lower production costs; (g) artificial chromosome-containing cells can be stored indefinitely at low temperature; and (h) insect larvae will be biological factories for production of nutritional, medicinal or industrial proteins by microinjection of fertilized insect eggs [see, e.g., Joy et al. (1991) Current Science 66:145-150, which provides a method for microinjecting heterologous DNA into Bombyx mori eggs].

[0201] Either MACs or insect-specific artificial chromosomes [BUGACs] will be used to introduce genes into insects. As described in the Examples, it appears that MACs will function in insects to direct expression of heterologous DNA contained thereon. For example, as described in the Examples, a MAC containing the B. mori actin gene promoter fused to the lacZ gene has been generated by transfection of EC3/7C5 cells with a plasmid containing the fusion gene. Subsequent fusion of the B. mori cells with the transfected EC3/7C5 cells that survived selection yielded a MAC-containing insect-mouse hybrid cell line in which β-galactosidase expression was detectable.

[0202] Insect host cells include, but are not limited to, hosts such as Spodoptera frugiperda [caterpillar], Aedes aegypti [mosquito], Aedes albopictus [mosquito], Drosophila melanogaster [fruitfly], Bombyx mori [silkworm], Manduca sexta [tomato hornworm] and Trichoplusia ni [cabbage looper]. Efforts have been directed toward propagation of insect cells in culture. Such efforts have focused on the fall armyworm, Spodoptera frugiperda. Cell lines have been developed also from other insects such as the cabbage looper, Trichoplusia ni, and the silkworm, Bombyx mori. It has also been suggested that analogous cell lines can be created using the tomato hornworm, Manduca sexta. To introduce DNA into an insect, it should be introduced into the larvae and allowed to proliferate, and then the hemolymph recovered from the larvae so that the proteins can be isolated therefrom.

[0203] The preferred method herein for introduction of artificial chromosomes into insect cells is microinjection [see, e.g., Tamura et al. (1991) Bio Ind. 8:26-31; Nikolaev et al. (1989) Mol. Biol. (Moscow) 23:1177-87; and methods exemplified and discussed herein].

[0204] E. Applications for and Uses of Artificial Chromosomes

[0205] Artificial chromosomes provide convenient and useful vectors, and in some instances [e.g., in the case of very large heterologous genes] the only vectors, for introduction of heterologous genes into hosts. Virtually any gene of interest is amenable to introduction into a host via artificial chromosomes. Such genes include, but are not limited to, genes that encode receptors, cytokines, enzymes, proteases, hormones, growth factors, antibodies, tumor suppressor genes, therapeutic products and multigene pathways.

[0206] The artificial chromosomes provided herein will be used in methods of protein and gene product production, particularly using insects as host cells for production of such products, and in cellular (e.g., mammalian cell) production systems in which the artificial chromosomes (particularly MACs) provide a reliable, stable and efficient means for optimizing the biomanufacturing of important compounds for medicine and industry. They are also intended for use in methods of gene therapy, and for production of transgenic plants and animals [discussed above, below and in the EXAMPLES].

[0207] 1. Gene Therapy

[0208] Any nucleic acid encoding a therapeutic gene product or product of a multigene pathway may be introduced into a host animal, such as a human, or into a target cell line for introduction into an animal, for therapeutic purposes. Such therapeutic purposes include genetic therapy to cure or to provide gene products that are missing or defective, to deliver agents, such as anti-tumor agents, to targeted cells or to an animal, and to provide gene products that will confer resistance or reduce susceptibility to a pathogen or ameliorate symptoms of a disease or disorder. The following are some exemplary genes and gene products. Such exemplification is not intended to be limiting.

[0209] a. Anti-HIV ribozymes

[0210] As exemplified below, DNA encoding anti-HIV ribozymes can be introduced and expressed in cells using MACs, including the euchromatin-based minichromosomes and the SATACs. These MACs can be used to make a transgenic mouse that expresses a ribozyme and, thus, serves as a model for testing the activity of such ribozymes or from which ribozyme-producing cell lines can be made. Also, introduction of a MAC that encodes an anti-HIV ribozyme into human cells will serve as treatment for HIV infection. Such systems further demonstrate the viability of using any disease-specific ribozyme to treat or ameliorate a particular disease.

[0211] b. Tumor Suppressor Genes

[0212] Tumor suppressor genes are genes that, in their wild-type alleles, express proteins that suppress abnormal cellular proliferation. When the gene coding for a tumor suppressor protein is mutated or deleted, the resulting mutant protein or the complete lack of tumor suppressor protein expression may result in a failure to correctly regulate cellular proliferation. Consequently, abnormal cellular proliferation may take place, particularly if there is already existing damage to the cellular regulatory mechanism. A number of well-studied human tumors and tumor cell lines have been shown to have missing or nonfunctional tumor suppressor genes.

[0213] Examples of tumor suppressor genes include, but are not limited to, the retinoblastoma susceptibility gene or RB gene, the p53 gene, the gene that is deleted in colon carcinoma [i.e., the DCC gene] and the neurofibromatosis type 1 [NF-1] tumor suppressor gene [see, e.g., U.S. Pat. No. 5,496,731; Weinberg et al. (1991) Science 254:1138-1146]. Loss of function or inactivation of tumor suppressor genes may play a central role in the initiation and/or progression of a significant number of human cancers.

[0214] The p53 Gene

[0215] Somatic cell mutations of the p53 gene are said to be the most frequent of the gene mutations associated with human cancer [see, e.g., Weinberg et al. (1991) Science 254:1138-1146]. The normal or wild-type p53 gene is a negative regulator of cell growth, which, when damaged, favors cell transformation. The p53 expression product is found in the nucleus, where it may act in parallel or cooperatively with other gene products. Tumor cell lines in which p53 has been deleted have been successfully treated with wild-type p53 vector to reduce tumorigenicity [see, Baker et al. (1990) Science 249:912-915].

[0216] DNA encoding the p53 gene and plasmids containing this DNA are well known [see, e.g., U.S. Pat. No. 5,260,191; see, also Chen et al. (1990) Science 250:1576; Farrel et al. (1991) EMBO J. 10:2879-2887; plasmids containing the gene are available from the ATCC, and the sequence is in the GenBank Database, accession nos. X54156, X60020, M14695, M16494, K03199].

[0217] c. The CFTR gene

[0218] Cystic fibrosis [CF] is an autosomal recessive disease that affects epithelia of the airways, sweat glands, pancreas, and other organs. It is a lethal genetic disease associated with a defect in chloride ion transport, and is caused by mutations in the gene coding for the cystic fibrosis transmembrane conductance regulator [CFTR], a 1480 amino acid protein that has been associated with the expression of chloride conductance in a variety of eukaryotic cell types. Defects in CFTR destroy or reduce the ability of epithelial cells in the airways, sweat glands, pancreas and other tissues to transport chloride ions in response to cAMP-mediated agonists and impair activation of apical membrane channels by cAMP-dependent protein kinase A [PKA]. Given the high incidence and devastating nature of this disease, development of effective CF treatments is imperative.

[0219] The CFTR gene [~250 kb] can be transferred into a MAC for use, for example, in gene therapy as follows. A CF-YAC [see Green et al. Science 250:94-98] may be modified to include a selectable marker, such as a gene encoding a protein that confers resistance to puromycin or hygromycin, and λ-DNA for use in site-specific integration into a neo-minichromosome or a SATAC. Such a modified CF-YAC can be introduced into MAC-containing cells, such as EC3/7C5 or 19C5xHa4 cells, by fusion with yeast protoplasts harboring the modified CF-YAC or microinjection of yeast nuclei harboring the modified CF-YAC into the cells. Stable transformants are then selected on the basis of antibiotic resistance. These transformants will carry the modified CF-YAC within the MAC contained in the cells.

[0220] 2. Animals, Birds, Fish and Plants that are Genetically Altered to Possess Desired Traits such as Resistance to Disease

[0221] Artificial chromosomes are ideally suited for preparing animals, including vertebrates and invertebrates, including birds and fish as well as mammals, that possess certain desired traits, such as, for example, disease resistance, resistance to harsh environmental conditions, altered growth patterns, and enhanced physical characteristics.

[0222] One example of the use of artificial chromosomes in generating disease-resistant organisms involves the preparation of multivalent vaccines. Such vaccines include genes encoding multiple antigens that can be carried in a MAC, or species-specific artificial chromosome, and either delivered to a host to induce immunity, or incorporated into embryos to produce transgenic (non-human) animals and plants that are immune or less susceptible to certain diseases.

[0223] Disease-resistant animals and plants may also be prepared in which resistance or decreased susceptibility to disease is conferred by introduction into the host organism or embryo of artificial chromosomes containing DNA encoding gene products (e.g., ribozymes and proteins that are toxic to certain pathogens) that destroy or attenuate pathogens or limit access of pathogens to the host.

[0224] Animals and plants possessing desired traits that might, for example, enhance utility, processibility and commercial value of the organisms in areas such as the agricultural and ornamental plant industries may also be generated using artificial chromosomes in the same manner as described above for production of disease-resistant animals and plants. In such instances, the artificial chromosomes that are introduced into the organism or embryo contain DNA encoding gene products that serve to confer the desired trait in the organism.

[0225] Birds, particularly fowl such as chickens, fish and crustaceans will serve as model hosts for production of genetically altered organisms using artificial chromosomes.

[0226] 3. Use of MACs and other Artificial Chromosomes for Preparation and Screening of Libraries

[0227] Since large fragments of DNA can be incorporated into each artificial chromosome, the chromosomes are well-suited for use as cloning vehicles that can accommodate entire genomes in the preparation of genomic DNA libraries, which then can be readily screened. For example, MACs may be used to prepare a genomic DNA library useful in the identification and isolation of functional centromeric DNA from different species of organisms. In such applications, the MAC used to prepare a genomic DNA library from a particular organism is one that is not functional in cells of that organism. That is, the MAC does not stably replicate, segregate or provide for expression of genes contained within it in cells of the organism. Preferably, the MACs contain an indicator gene (e.g., the lacZ gene encoding β-galactosidase or genes encoding products that confer resistance to antibiotics such as neomycin, puromycin, hygromycin) linked to a promoter that is capable of promoting transcription of the indicator gene in cells of the organism. Fragments of genomic DNA from the organism are incorporated into the MACs, and the MACs are transferred to cells from the organism. Cells that contain MACs that have incorporated functional centromeres contained within the genomic DNA fragments are identified by detection of expression of the marker gene.

[0228] 4. Use of MACs and other Artificial Chromosomes for Stable, High-level Protein Production

[0229] Cells containing the MACs and/or other artificial chromosomes provided herein are advantageously used for production of proteins, particularly several proteins from one cell line, such as multiple proteins involved in a biochemical pathway or multivalent vaccines. The genes encoding the proteins are introduced into the artificial chromosomes, which are then introduced into cells. Alternatively, the heterologous gene(s) of interest are transferred into a production cell line that already contains artificial chromosomes in a manner that targets the gene(s) to the artificial chromosomes. The cells are cultured under conditions whereby the heterologous proteins are expressed. Because the proteins will be expressed at high levels in a stable permanent extra-genomic chromosomal system, selective conditions are not required.

[0230] Any transfectable cells capable of serving as recombinant hosts adaptable to continuous propagation in a cell culture system [see, e.g., McLean (1993) Trends In Biotech. 11:232-238] are suitable for use in an artificial chromosome-based protein production system. Exemplary host cell lines include, but are not limited to, the following: Chinese hamster ovary (CHO) cells [see, e.g., Zang et al. (1995) Biotechnology 13:389-392], HEK 293, Ltk-, COS-7, DG44, and BHK cells. CHO cells are particularly preferred host cells. Selection of host cell lines for use in artificial chromosome-based protein production systems is within the skill of the art, but often will depend on a variety of factors, including the properties of the heterologous protein to be produced, potential toxicity of the protein in the host cell, any requirements for post-translational modification (e.g., glycosylation, amination, phosphorylation) of the protein, transcription factors available in the cells, the type of promoter element(s) being used to drive expression of the heterologous gene, whether production will be completely intracellular or the heterologous protein will preferably be secreted from the cell, and the types of processing enzymes in the cell.

[0231] The artificial chromosome-based system for heterologous protein production has many advantageous features. For example, as described above, because the heterologous DNA is located in an independent, extra-genomic artificial chromosome (as opposed to randomly inserted in an unknown area of the host cell genome or located as extrachromosomal element(s) providing only transient expression) it is stably maintained in an active transcription unit and is not subject to ejection via recombination or elimination during cell division. Accordingly, it is unnecessary to include a selection gene in the host cells and thus growth under selective conditions is also unnecessary. Furthermore, because the artificial chromosomes are capable of incorporating large segments of DNA, multiple copies of the heterologous gene and linked promoter element(s) can be retained in the chromosomes, thereby providing for high-level expression of the foreign protein(s). Alternatively, multiple copies of the gene can be linked to a single promoter element and several different genes may be linked in a fused polygene complex to a single promoter for expression of, for example, all the key proteins constituting a complete metabolic pathway [see, e.g., Beck von Bodman et al. (1995) Biotechnology 13:587-591]. Alternatively, multiple copies of a single gene can be operatively linked to a single promoter, or each or one or several copies may be linked to different promoters or multiple copies of the same promoter. Additionally, because artificial chromosomes have an almost unlimited capacity for integration and expression of foreign genes, they can be used not only for the expression of genes encoding end-products of interest, but also for the expression of genes associated with optimal maintenance and metabolic management of the host cell, e.g., genes encoding growth factors, as well as genes that may facilitate rapid synthesis of the correct form of the desired heterologous protein product, e.g., genes encoding processing enzymes and transcription factors. The MACs are suitable for expression of any proteins or peptides, including proteins and peptides that require in vivo posttranslational modification for their biological activity. Such proteins include, but are not limited to, antibody fragments, full-length antibodies, and multimeric antibodies, tumor suppressor proteins, naturally occurring or artificial antibodies and enzymes, heat shock proteins, and others.

[0232] Thus, such cell-based “protein factories” employing MACs can be generated using MACs constructed with multiple copies [theoretically an unlimited number or at least up to a number such that the resulting MAC is about up to the size of a genomic chromosome (i.e., endogenous)] of protein-encoding genes with appropriate promoters, or multiple genes driven by a single promoter, i.e., a fused gene complex [such as a complete metabolic pathway in a plant expression system; see, e.g., Beck von Bodman (1995) Biotechnology 13:587-591]. Once such a MAC is constructed, it can be transferred to a suitable cell culture system, such as a CHO cell line in protein-free culture medium [see, e.g., (1995) Biotechnology 13:389-39] or other immortalized cell lines [see, e.g., (1993) TIBTECH 11:232-238] where continuous production can be established.

[0233] The ability of MACs to provide for high-level expression of heterologous proteins in host cells is demonstrated, for example, by analysis of the H1D3 and G3D5 cell lines described herein and deposited with the ECACC. Northern blot analysis of mRNA obtained from these cells reveals that expression of the hygromycin-resistance and β-galactosidase genes in the cells correlates with the amplicon number of the megachromosome(s) contained therein.

[0234] F. Methods for the Synthesis of DNA Sequences Containing Repeated DNA Units

[0235] Generally, assembly of tandemly repeated DNA poses difficulties, such as achieving unambiguous annealing of the complementary oligos. For example, separately annealed products may ligate in an inverted orientation. Additionally, tandem or inverted repeats are particularly susceptible to recombination and deletion events that may disrupt the sequence. Selection of appropriate host organisms (e.g., rec⁻ strains) for use in the cloning steps of the synthesis of sequences of tandemly repeated DNA units may aid in reduction and elimination of such events.

[0236] Methods are provided herein for the synthesis of extended DNA sequences containing repeated DNA units. These methods are particularly applicable to the synthesis of arrays of tandemly repeated DNA units, which are generally difficult or not possible to construct utilizing other known gene assembly strategies. A specific use of these methods is in the synthesis of sequences of any length containing simple (e.g., ranging from 2-6 nucleotides) tandem repeats (such as telomeres and satellite DNA repeats and trinucleotide repeats of possible clinical significance) as well as complex repeated DNA sequences. A particular example of the synthesis of a telomere sequence containing over 150 successive repeated hexamers utilizing these methods is provided herein.

[0237] The methods provided herein for synthesis of arrays of tandem DNA repeats are based on a series of extension steps in which successive doublings of a sequence of repeats result in an exponential expansion of the array of tandem repeats. These methods provide several advantages over previously known methods of gene assembly. For instance, the starting oligonucleotides are used only once. The intermediates in, as well as the final product of, the construction of the DNA arrays described herein may be obtained in cloned form in a microbial organism (e.g., E. coli and yeast). Of particular significance with regard to these methods is the fact that sequence length increases exponentially, as opposed to linearly, in each extension step of the procedure even though only two oligonucleotides are required in the methods. The construction process does not depend on the compatibility of restriction enzyme recognition sequences and the sequence of the repeated DNA because restriction sites are used only temporarily during the assembly procedure. No adaptor is necessary, though a region of similar function is located between two of the restriction sites employed in the process. The only limitation with respect to restriction site use is that the two sites employed in the method must not be present elsewhere in the vector utilized in any cloning steps. These procedures can also be used to construct complex repeats with perfectly identical repeat units, such as the variable number tandem repeat (VNTR) 3′ of the human apolipoprotein B100 gene (a repeat unit of 30 bp, 100% AT) or alphoid satellite DNA.
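The difference between exponential and linear growth of the array can be illustrated with a short calculation. The following Python sketch is illustrative only and is not part of the disclosed method: it assumes an idealized process in which every extension step exactly doubles the number of repeat units, and it compares this with a hypothetical linear scheme that adds one starting-size block per step. The function names and the starting value of 10 hexamer units (a 60-nucleotide stretch, the lower end of the range given for oligonucleotide 1) are assumptions chosen for the example.

```python
def doublings_needed(start_repeats: int, target_repeats: int) -> int:
    """Number of ideal doubling steps to reach at least target_repeats.

    Assumes each extension step exactly doubles the repeat count
    (an idealization; real cloning steps may lose or rearrange repeats).
    """
    if start_repeats < 1 or target_repeats < 1:
        raise ValueError("repeat counts must be positive")
    steps = 0
    repeats = start_repeats
    while repeats < target_repeats:
        repeats *= 2
        steps += 1
    return steps


def linear_steps_needed(start_repeats: int, target_repeats: int) -> int:
    """Steps needed if each step only added one block of start_repeats units.

    Hypothetical comparison scheme, not a method described in the text.
    """
    if start_repeats < 1 or target_repeats < 1:
        raise ValueError("repeat counts must be positive")
    missing = max(0, target_repeats - start_repeats)
    return -(-missing // start_repeats)  # ceiling division


# Assumed example: 10 hexamer units in the starting oligonucleotide,
# target of 150 units (the telomere example mentions over 150 hexamers).
exponential_steps = doublings_needed(10, 150)
linear_steps = linear_steps_needed(10, 150)
print(exponential_steps, linear_steps)
```

Under these assumptions the array grows 10, 20, 40, 80, 160 units, so four doublings suffice, whereas the linear comparison scheme would need fourteen additions.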

[0238] The method of synthesizing DNA sequences containing tandem repeats may generally be described as follows.

[0239] 1. Starting Materials

[0240] Two oligonucleotides are utilized as starting materials. Oligonucleotide 1 is of length k of repeated sequence (the flanks of which are not relevant) and contains a relatively short stretch (60-90 nucleotides) of the repeated sequence, flanked with appropriately chosen restriction sites:

[0241] 5′-S1>>>>>>>>>>>>>>>>>>>>>>>>>>>S2_-3′

[0242] wherein S1 is restriction site 1 cleaved by E1 [preferably an enzyme producing a 3′-overhang (e.g., PacI, PstI, SphI, NsiI, etc.) or blunt-end], S2 is a second restriction site cleaved by E2 (preferably an enzyme producing a 3′-overhang or one that cleaves outside the recognition sequence, such as TspRI), > represents a simple repeat unit, and ‘_’ denotes a short (8-10 nucleotide) flanking sequence complementary to oligonucleotide 2:

[0243] 3′-_S3-5′

[0244] wherein S3 is a third restriction site, for enzyme E3, that is present in the vector to be used during the construction.

[0245] Because there is a large variety of restriction enzymes that recognize many different DNA sequences as cleavage sites, it should always be possible to select sites and enzymes (preferably those that yield a 3′-protruding end) suitable for these methods in connection with the synthesis of any one particular repeat array. In most cases, only 1 (or perhaps 2) nucleotide(s) of a restriction site is required to be present in the repeat sequence, and the remaining nucleotides of the restriction site can be removed, for example:
PacI: TTAAT/TAA-- (Klenow/dNTP) TAA--
PstI: CTGCA/G-- (Klenow/dNTP) G--
NsiI: ATGCA/T-- (Klenow/dNTP) T--
KpnI: GGTAC/C-- (Klenow/dNTP) C--

[0246] Though there is no known restriction enzyme leaving a single A behind, this problem can be solved with enzymes leaving behind none at all, for example:
TaiI: ACGT/ (Klenow/dNTP) --
NlaIII: CATG/ (Klenow/dNTP) --

[0247] Additionally, if mung bean nuclease is used instead of Klenow, then the following applies:

[0248] XbaI: T/CTAGA (mung bean nuclease) A--

[0249] Furthermore, there are a number of restriction enzymes that cut outside of the recognition sequence, and in this case, there is no limitation at all:
TspRI: NNCAGTGNN/-- (Klenow/dNTP) --
BsmI: GAATGCN/-- (Klenow/dNTP) --
BsmI (opposite strand): CTTAC/GN-- (Klenow/dNTP) --

[0250] 2. Step 1-Annealing

[0251] Oligonucleotides 1 and 2 are annealed at a temperature selected depending on the length of overlap (typically in the range of 30-65° C.).

[0252] 3. Step 2-Generating a Double-stranded Molecule

[0253] The annealed oligonucleotides are filled-in with Klenow polymerase in the presence of dNTP to produce a double-stranded (ds) sequence:
5′-S1>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>S2_(---)S3-3′
3′-S1<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<S2_(---)S3-5′

[0254] 4. Step 3-Incorporation of Double-stranded DNA into a Vector

[0255] The double-stranded DNA is cleaved with restriction enzymes E1 and E3 and subsequently ligated into a vector (e.g., pUC19 or a yeast vector) that has been cleaved with the same enzymes E1 and E3. The ligation product is used to transform competent host cells compatible with the vector being used (e.g., when pUC19 is used, bacterial cells such as E. coli DH5α are suitable hosts), which are then plated onto selection plates. Recombinants can be identified either by color (e.g., by X-gal staining for β-galactosidase expression) or by colony hybridization using ³²P-labeled oligonucleotide 2 (detection by hybridization to oligonucleotide 2 is preferred because its sequence is removed in each of the subsequent extension steps and thus is present only in recombinants that contain DNA that has undergone successful extension of the repeated sequence).

[0256] 5. Step 4-Isolation of Insert from the Plasmid

[0257] An aliquot of the recombinant plasmid containing k nucleotides of the repeat sequence is digested with restriction enzymes E1 and E3, and the insert is isolated on a gel (native polyacrylamide while the insert is short, but agarose can be used for isolation of longer inserts in subsequent steps). A second aliquot of the recombinant plasmid is cut with enzymes E2 (treated with Klenow and dNTP to remove the 3′-overhang) and E3, and the large fragment (plasmid DNA plus the insert) is isolated.

[0258] 6. Step 5-Extension of the DNA Sequence of k Repeats

[0259] The two DNAs (the S1-S3 insert fragment and the vector plus insert) are ligated, plated to selective plates, and screened for extended recombinants as in Step 3. Now the length of the repeat sequence between restriction sites is twice that of the repeat sequence in the previous step, i.e., 2xk.

[0260] 7. Step 6-Extension of the DNA Sequence of 2xk Repeats

[0261] Steps 4 and 5 are repeated as many times as needed to achieve the desired repeat sequence size. In each extension cycle, the repeat sequence size doubles, i.e., if m is the number of extension cycles, the size of the repeat sequence will be k × 2^m nucleotides.
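
The doubling relationship above can be illustrated with a short calculation. The following Python sketch is offered only as an illustration, with hypothetical function names; it computes the repeat length after m extension cycles and the smallest number of cycles needed to reach a target length. The example values (a 60-nucleotide starting stretch and a 900-nucleotide target, i.e., 150 hexamers) are assumptions chosen to fall within the ranges mentioned in the text.

```python
def repeat_length(k: int, m: int) -> int:
    """Length of the repeat sequence after m doubling cycles: k * 2**m."""
    if k <= 0 or m < 0:
        raise ValueError("k must be positive and m non-negative")
    return k * 2 ** m


def cycles_needed(k: int, target: int) -> int:
    """Smallest number of extension cycles m such that k * 2**m >= target."""
    if k <= 0 or target <= 0:
        raise ValueError("k and target must be positive")
    m = 0
    length = k
    while length < target:
        length *= 2  # each extension cycle doubles the repeat array
        m += 1
    return m


if __name__ == "__main__":
    k = 60            # assumed starting repeat stretch, in nucleotides
    target = 150 * 6  # 150 hexamers = 900 nucleotides
    m = cycles_needed(k, target)
    print(m, repeat_length(k, m))  # prints: 4 960
```

Under these assumptions, four extension cycles suffice, because 60 × 2^4 = 960 nucleotides, whereas three cycles give only 480.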

[0262] The following examples are included for illustrative purposes only and are not intended to limit the scope of the invention.

EXAMPLE 1

[0263] General Materials and Methods

[0264] The following materials and methods are exemplary of methods that are used in the following Examples and that can be used to prepare cell lines containing artificial chromosomes. Other suitable materials and methods known to those of skill in the art may be used. Modifications of these materials and methods known to those of skill in the art may also be employed.

[0265] A. Culture of Cell Lines, Cell Fusion, and Transfection of Cells

[0266] 1. Chinese hamster K-20 cells and mouse A9 fibroblast cells were cultured in F-12 medium. EC3/7 [see, U.S. Pat. No. 5,288,625, and deposited at the European Collection of Animal Cell Cultures (ECACC) under accession no. 90051001; see, also Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110 and U.S. application Ser. No. 08/375,271] and EC3/7C5 [see, U.S. Pat. No. 5,288,625 and Praznovszky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:11042-11046] mouse cell lines, and the KE1-2/4 hybrid cell line were maintained in F-12 medium containing 400 μg/ml G418 [SIGMA, St. Louis, Mo.].

[0267] 2. TF1004G19 and TF1004G-19C5 mouse cells, described below, and the 19C5xHa4 hybrid, described below, and its sublines were cultured in F-12 medium containing up to 400 μg/ml Hygromycin B [Calbiochem]. LP11 cells were maintained in F-12 medium containing 3-15 μg/ml Puromycin [SIGMA, St. Louis, Mo.].

[0268] 3. Cotransfection of EC3/7C5 cells with plasmids [pH132, pCH110 available from Pharmacia, see, also Hall et al. (1983) J. Mol. Appl. Gen. 2:101-109] and with λ DNA was conducted using the calcium phosphate DNA precipitation method [see, e.g., Chen et al. (1987) Mol. Cell. Biol. 7:2745-2752], using 2-5 μg plasmid DNA and 20 μg λ phage DNA per 5×10⁶ recipient cells.

[0269] 4. Cell Fusion

[0270] Mouse and hamster cells were fused using polyethylene glycol [Davidson et al. (1976) Som. Cell Genet. 2:165-176]. Hybrid cells were selected in HAT medium containing 400 μg/ml Hygromycin B.

[0271] Approximately 2×10⁷ recipient and 2×10⁶ donor cells were fused using polyethylene glycol [Davidson et al. (1976) Som. Cell Genet. 2:165-176]. Hybrids were selected and maintained in F-12/HAT medium [Szybalsky et al. (1962) Natl. Cancer Inst. Monogr. 7:75-89] containing 10% FCS and 400 μg/ml G418. The presence of “parental” chromosomes in the hybrid cell lines was verified by in situ hybridization with species-specific probes using biotin-labeled human and hamster genomic DNA, and a mouse long interspersed repetitive DNA [pMCPE1.51].

[0272] 5. Microcell Fusion

[0273] Microcell-mediated transfer of artificial chromosomes from EC3/7C5 cells to recipient cells was done according to Saxon et al. [(1985) Mol. Cell. Biol. 1:140-146] with the modifications of Goodfellow et al. [(1989) Techniques for mammalian genome transfer. In Genome Analysis: A Practical Approach. K. E. Davies, ed., IRL Press, Oxford, Washington D.C. pp. 1-17] and Yamada et al. [(1990) Oncogene 5:1141-1147]. Briefly, 5×10⁶ EC3/7C5 cells in a T25 flask were treated first with 0.05 μg/ml colcemid for 48 hr and then with 10 μg/ml cytochalasin B for 30 min. The T25 flasks were centrifuged on edge and the pelleted microcells were suspended in serum-free DME medium. The microcells were filtered through first a 5 micron and then a 3 micron polycarbonate filter, treated with 50 μg/ml of phytohemagglutinin, and used for polyethylene glycol-mediated fusion with recipient cells. Selection of cells containing the MMCneo was started 48 hours after fusion in medium containing 400-800 μg/ml G418.

[0274] Microcells were also prepared from 1B3 and GHB42 donor cells as follows in order to be fused with E2D6K cells [a CHO K-20 cell line carrying the puromycin N-acetyltransferase gene, i.e., the puromycin resistance gene, under the control of the SV40 early promoter]. The donor cells were seeded to achieve 60-75% confluency within 24-36 hours. After that time, the cells were arrested in mitosis by exposure to colchicine (10 μg/ml) for 12 or 24 hours to induce micronucleation. To promote micronucleation of GHB42 cells, the cells were exposed to hypotonic treatment (10 min at 37° C.). After colchicine treatment, or after colchicine and hypotonic treatment, the cells were grown in colchicine-free medium.

[0275] The donor cells were trypsinized and centrifuged and the pellets were suspended in a 1:1 Percoll medium and incubated for 30-40 min at 37° C. After the incubation, 1-3×10⁷ cells (60-70% micronucleation index) were loaded onto each Percoll gradient (each fusion was distributed on 1-2 gradients). The gradients were centrifuged at 19,000 rpm for 80 min in a Sorvall SS-34 rotor at 34-37° C. After centrifugation, two visible bands of cells were removed, centrifuged at 2000 rpm, 10 min at 4° C., resuspended and filtered through 8 μm pore size nucleopore filters.
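
Because the centrifugation conditions above are stated in rpm while other steps in this specification are stated in ×g, a conversion may be helpful when a different rotor is used. The following Python sketch applies the standard relation RCF = 1.118×10⁻⁵ × r × N², where r is the rotor radius in cm and N is the speed in rpm. The function names are hypothetical, and the radius of 10.7 cm used in the example is an assumed value that is not given in the text; the actual radius should be taken from the rotor documentation.

```python
import math

# Standard conversion constant for radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    if rpm < 0 or radius_cm <= 0:
        raise ValueError("rpm must be non-negative and radius positive")
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Speed (rpm) required to reach a given RCF at a given rotor radius."""
    if rcf < 0 or radius_cm <= 0:
        raise ValueError("rcf must be non-negative and radius positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    assumed_radius_cm = 10.7  # assumption; not stated in the text
    print(round(rcf_from_rpm(19000, assumed_radius_cm)))
```

With the assumed radius, 19,000 rpm corresponds to roughly 43,000 ×g; a rotor with a different radius would require a different speed to reach the same force.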

[0276] The microcells prepared from the 1B3 and GHB42 cells were fused with E2D6K. The E2D6K cells were generated by CaPO₄ transfection of CHO K-20 cells with pCHTV2. Plasmid pCHTV2 contains the puromycin-resistance gene linked to the SV40 promoter and polyadenylation signal, the Saccharomyces cerevisiae URA3 gene, 2.4- and 3.2-kb fragments of a Chinese hamster chromosome 2-specific satellite DNA (HC-2 satellite; see Fátyol et al. (1994) Nucl. Acids Res. 22:3728-3736), two copies of the diphtheria toxin-A chain gene (one linked to the herpes simplex virus thymidine kinase (HSV-TK) gene promoter and SV40 polyadenylation signal and the other linked to the HSV-TK promoter without a polyadenylation signal), the ampicillin-resistance gene and the ColE1 origin of replication. Following transfection, puromycin-resistant colonies were isolated. The presence of the pCHTV2 plasmid in the E2D6K cell line was confirmed by nucleic acid amplification of DNA isolated from the cells.

[0277] The purified microcells were centrifuged as described above and resuspended in 2 ml of phytohemagglutinin-P (PHA-P, 100 μg/ml). The microcell suspension was then added to a 60-70% confluent recipient culture of E2D6K cells. The preparation was incubated at room temperature for 30-40 min to agglutinate the microcells. After the PHA-P was removed, the cells were incubated with 1 ml of 50% polyethylene glycol (PEG) for one min. The PEG was removed and the culture was washed three times with F-12 medium without serum. The cells were incubated in non-selective medium for 48-60 hours. After this time, the cell culture was trypsinized and plated in F-12 medium containing 400 μg/ml hygromycin B and 10 μg/ml puromycin to select against the parental cell lines.

[0278] Hybrid clones were isolated from the cells that had been cultured in selective medium. These clones were then analyzed for expression of β-galactosidase by the X-gal staining method. Four of five hybrid clones analyzed that had been generated by fusion of GHB42 microcells with E2D6K cells yielded positive staining results indicating expression of β-galactosidase from the lacZ gene contained in the megachromosome contributed by the GHB42 cells. Similarly, a hybrid clone that had been generated by fusion of 1B3 microcells with E2D6K cells yielded positive staining results indicating expression of β-galactosidase from the lacZ gene contained in the megachromosome contributed by the 1B3 cells. In situ hybridization analysis of the hybrid clones is also performed to analyze the mouse chromosome content of the mouse-hamster hybrid cells.

[0279] B. Chromosome banding

[0280] Trypsin G-banding of chromosomes was performed using the method of Wang & Fedoroff [(1972) Nature 235:52-54], and the detection of constitutive heterochromatin with the BSG C-banding method was done according to Sumner [(1972) Exp. Cell Res. 75:304-306]. For the detection of chromosome replication by bromodeoxyuridine [BrdU] incorporation, the Fluorescein Plus Giemsa [FPG] staining method of Perry & Wolff [(1974) Nature 251:156-158] was used.

[0281] C. Immunolabelling of chromosomes and in situ hybridization

[0282] Indirect immunofluorescence labelling with human anti-centromere serum LU851 [Hadlaczky et al. (1986) Exp. Cell Res. 167:1-15], and indirect immunofluorescence and in situ hybridization on the same preparation were performed as described previously [see, Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110, see, also U.S. application Ser. No. 08/375,271]. Immunolabelling with fluorescein-conjugated anti-BrdU monoclonal antibody [Boehringer] was performed according to the procedure recommended by the manufacturer, except that for treatment of mouse A9 chromosomes, 2 M hydrochloric acid was used at 37° C. for 25 min, and for chromosomes of hybrid cells, 1 M hydrochloric acid was used at 37° C. for 30 min.

[0283] D. Scanning electron microscopy

[0284] Preparation of mitotic chromosomes for scanning electron microscopy using osmium impregnation was performed as described previously [Sumner (1991) Chromosoma 100:410-418]. The chromosomes were observed with a Hitachi S-800 field emission scanning electron microscope operated with an accelerating voltage of 25 kV.

[0285] E. DNA manipulations, plasmids and probes

[0286] 1. General Methods

[0287] All general DNA manipulations were performed by standard procedures [see, e.g., Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.]. The mouse major satellite probe was provided by Dr. J. B. Rattner [University of Calgary, Alberta, Canada]. Cloned mouse satellite DNA probes [see Wong et al. (1988) Nucl. Acids Res. 16:11645-11661], including the mouse major satellite probe, were gifts from Dr. J. B. Rattner, University of Calgary. Hamster chromosome painting was done with total hamster genomic DNA, and a cloned repetitive sequence specific to the centromeric region of chromosome 2 [Fátyol et al. (1994) Nucl. Acids Res. 22:3728-3736] was also used. Mouse chromosome painting was done with a cloned long interspersed repetitive sequence [pMCPE1.51] specific for the mouse euchromatin.

[0288] For cotransfection and for in situ hybridization, the pCH110 β-galactosidase construct [Pharmacia or Invitrogen], and λcl 875 Sam7 phage DNA [New England Biolabs] were used.

[0289] 2. Construction of Plasmid pPuroTel

[0290] Plasmid pPuroTel, which carries a Puromycin-resistance gene and a cloned 2.5 kb human telomeric sequence [see SEQ ID No. 3], was constructed from the pBabe-puro retroviral vector [Morgenstern et al. (1990) Nucl. Acids Res. 18:3587-3596; provided by Dr. L. Székely (Microbiology and Tumorbiology Center, Karolinska Institutet, Stockholm); see, also Tonghua et al. (1995) Chin. Med. J. (Beijing, Engl. Ed.) 108:653-659; Couto et al. (1994) Infect. Immun. 62:2375-2378; Dunckley et al. (1992) FEBS Lett. 296:128-34; French et al. (1995) Anal. Biochem. 228:354-355; Liu et al. (1995) Blood 85:1095-1103; International PCT application Nos. WO 9520044; WO 9500178, and WO 9419456].

[0291] F. Deposited cell lines

[0292] Cell lines KE1-2/4, EC3/7C5, TF1004G19C5, 19C5xHa4, G3D5 and H1D3 have been deposited in accord with the Budapest Treaty at the European Collection of Animal Cell Cultures (ECACC) under Accession Nos. 96040924, 96040925, 96040926, 96040927, 96040928 and 96040929, respectively. The cell lines were deposited on Apr. 9, 1996, at the European Collection of Animal Cell Cultures (ECACC), Vaccine Research and Production Laboratory, Public Health Laboratory Service, Centre for Applied Microbiology and Research, Porton Down, Salisbury, Wiltshire SP4 0JG, United Kingdom. The deposits were made in the name of Gyula Hadlaczky of H. 6723, SZEGED, SZAMOS U.1.A. IX. 36. HUNGARY, who has authorized reference to the deposited cell lines in this application.

EXAMPLE 2

[0293] Preparation of EC3/7, EC3/7C5 and Related Cell Lines

[0294] The EC3/7 cell line is an LMTK⁻ mouse cell line that contains the neo-centromere. The EC3/7C5 cell line is a single-cell subclone of EC3/7 that contains the neo-minichromosome.

[0295] A. EC3/7 Cell line

[0296] As described in U.S. Pat. No. 5,288,625 [see, also Praznovszky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:11042-11046 and Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110], de novo centromere formation occurs in a transformed mouse LMTK⁻ fibroblast cell line [EC3/7] after cointegration of λ constructs [λCM8 and λgtWESneo] carrying human and bacterial DNA.

[0297] By cotransfection of a 14 kb human DNA fragment cloned in λ [λCM8] and a dominant marker gene [λgtWESneo], a selectable centromere linked to a dominant marker gene [neo-centromere] was formed in mouse LMTK⁻ cell line EC3/7 [Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110, see FIG. 1]. Integration of the heterologous DNA [the λ DNA and marker gene-encoding DNA] occurred into the short arm of an acrocentric chromosome [chromosome 7 (see, FIG. 1B)], where an amplification process resulted in the formation of the new centromere [neo-centromere (see FIG. 1C)]. On the dicentric chromosome (FIG. 1C), the newly formed centromere region contains all the heterologous DNA (human, λ, and bacterial) introduced into the cell and an active centromere.

[0298] Having two functionally active centromeres on the same chromosome causes regular breakages between the centromeres [see, FIG. 1E]. The distance between the two centromeres on the dicentric chromosome is estimated to be ˜10-15 Mb, and the breakage that separates the minichromosome occurred between the two centromeres. Such specific chromosome breakages result in the appearance [in approximately 10% of the cells] of a chromosome fragment that carries the neo-centromere [FIG. 1F]. This chromosome fragment is principally composed of human, λ, plasmid, and neomycin-resistance gene DNA, but it also has some mouse chromosomal DNA. Cytological evidence suggests that during the stabilization of the MMCneo, there was an inverted duplication of the chromosome fragment bearing the neo-centromere. The size of minichromosomes in cell lines containing the MMCneo is approximately 20-30 Mb; this finding indicates a two-fold increase in size.

[0299] From the EC3/7 cell line, which contains the dicentric chromosome [FIG. 1E], two sublines [EC3/7C5 and EC3/7C6] were selected by repeated single-cell cloning. In these cell lines, the neo-centromere was found exclusively on a small chromosome [neo-minichromosome], while the formerly dicentric chromosome carried detectable amounts of the exogenously-derived DNA sequences but not an active neo-centromere [FIGS. 1F and 1G].

[0300] The minichromosomes of cell lines EC3/7C5 and EC3/7C6 are similar. No differences are detected in their architectures at either the cytological or molecular level. The minichromosomes were indistinguishable by conventional restriction endonuclease mapping or by long-range mapping using pulsed field electrophoresis and Southern hybridization. The cytoskeleton of cells of the EC3/7C6 line showed an increased sensitivity to colchicine, so the EC3/7C5 line was used for further detailed analysis.

[0301] B. Preparation of the EC3/7C5 and EC3/7C6 cell lines

[0302] The EC3/7C5 cells, which contain the neo-minichromosome, were produced by subcloning the EC3/7 cell line in high concentrations of G418 [40-fold the lethal dose] for 350 generations. Two single cell-derived stable cell lines [EC3/7C5 and EC3/7C6] were established.

[0303] These cell lines carry the neo-centromere on minichromosomes and also contain the remaining fragment of the dicentric chromosome. Indirect immunofluorescence with anti-centromere antibodies and subsequent in situ hybridization experiments demonstrated that the minichromosomes derived from the dicentric chromosome. In EC3/7C5 and EC3/7C6 cell lines (140 and 128 metaphases, respectively) no intact dicentric chromosomes were found, and minichromosomes were detected in 97.2% and 98.1% of the cells, respectively. The minichromosomes have been maintained for over 150 cell generations. The cell lines do contain the remaining portion of the formerly dicentric chromosome.

[0304] Multiple copies of telomeric DNA sequences were detected in the marker centromeric region of the remaining portion of the formerly dicentric chromosome by in situ hybridization. This indicates that mouse telomeric sequences were coamplified with the foreign DNA sequences. These stable minichromosome-carrying cell lines provide direct evidence that the extra centromere is functioning and is capable of maintaining the minichromosomes [see, U.S. Pat. No. 5,288,625].

[0305] The chromosome breakage in the EC3/7 cells, which separates the neo-centromere from the mouse chromosome, occurred in the G-band positive “foreign” DNA region. This is supported by the observation of traces of λ and human DNA sequences at the broken end of the formerly dicentric chromosome. Comparing the G-band pattern of the chromosome fragment carrying the neo-centromere with that of the stable neo-minichromosome reveals that the neo-minichromosome is an inverted duplicate of the chromosome fragment that bears the neo-centromere.

[0306] This is also evidenced by the observation that although the neo-minichromosome carries only one functional centromere, both ends of the minichromosome are heterochromatic, and mouse satellite DNA sequences were found in these heterochromatic regions by in situ hybridization.

[0307] These two cell lines, EC3/7C5 and EC3/7C6, thus carry a selectable mammalian minichromosome [MMCneo] with a centromere linked to a dominant marker gene [Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110]. MMCneo is intended to be used as a vector for minichromosome-mediated gene transfer and has been used as a model of a minichromosome-based vector system.

[0308] Long range mapping studies of the MMCneo indicated that human DNA and the neomycin-resistance gene constructs integrated into the mouse chromosome separately, followed by the amplification of the chromosome region that contains the exogenous DNA. The MMCneo contains about 30-50 copies of the λCM8 and λgtWESneo DNA in the form of approximately 160 kb repeated blocks, which together cover at least a 3.5 Mb region. In addition to these, there are mouse telomeric sequences [Praznovszky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:11042-11046] and any DNA of mouse origin necessary for the correct higher-ordered structural organization of chromatids.

[0309] Using a chromosome painting probe pMCPE1.51 [mouse long interspersed repeated DNA], which recognizes exclusively euchromatic mouse DNA, detectable amounts of interspersed repeat sequences were found on the MMCneo by in situ hybridization. The neo-centromere is associated with a small but detectable amount of satellite DNA. The chromosome breakage that separates the neo-centromere from the mouse chromosome occurs in the “foreign” DNA region. This is demonstrated by the presence of λ and human DNA at the broken end of the formerly dicentric chromosome. At both ends of the MMCneo, however, there are traces of mouse major satellite DNA as evidenced by in situ hybridization. This observation suggests that the doubling in size of the chromosome fragment carrying the neo-centromere during the stabilization of the MMCneo is a result of an inverted duplication. Although mouse telomere sequences, which coamplified with the exogenous DNA sequences during the neo-centromere formation, may provide sufficient telomeres for the MMCneo, the duplication could have supplied the functional telomeres for the minichromosome.

[0310] The nucleotide sequence of portions of the neo-minichromosomes was determined as follows. Total DNA was isolated from EC3/7C5 cells according to standard procedures. The DNA was subjected to nucleic acid amplification using the Expand Long Template PCR system [Boehringer Mannheim] according to the manufacturer's procedures. The amplification procedure required only a single 33-mer oligonucleotide primer corresponding to sequence in a region of the phage λ right arm, which is contained in the neo-minichromosome. The sequence of this oligonucleotide is set forth as the first 33 nucleotides of SEQ ID No. 13. Because the neo-minichromosome contains a series of inverted repeats of this sequence, the single oligonucleotide was used as a forward and reverse primer resulting in amplification of DNA positioned between sets of inverted repeats of the phage λ DNA. Three products were obtained from the single amplification reaction, which suggests that the sequence of the DNA located between different sets of inverted repeats may differ. In a repeating nucleic acid unit within an artificial chromosome, minor differences may be present and may occur during culturing of cells containing the artificial chromosome. For example, base pair changes may occur as well as integration of mobile genetic elements and deletions of repeated sequences.
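
The reason a single primer suffices when the target is flanked by inverted repeats can be illustrated in silico. The following Python sketch is a simplified model and not the procedure actually used: the function names, the 6-nucleotide toy primer, and the toy template are all hypothetical (the primer described above is a 33-mer). It reports every segment that begins with the primer sequence and ends with the reverse complement of the primer, which are the segments a single primer could in principle amplify.

```python
# Hypothetical names; a simplified in silico model of single-primer amplification.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_all(text: str, pattern: str) -> list:
    """Return the start positions of all (possibly overlapping) matches."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def single_primer_amplicons(template: str, primer: str, max_len: int = 20000) -> list:
    """List segments flanked by the primer and its reverse complement.

    A segment qualifies when the primer sequence occurs on the given strand
    and its reverse complement occurs downstream, i.e., the primer sequence
    is present as an inverted repeat, and the segment is at most max_len long.
    """
    template = template.upper()
    primer = primer.upper()
    forward_sites = find_all(template, primer)
    reverse_sites = find_all(template, reverse_complement(primer))
    products = []
    for i in forward_sites:
        for j in reverse_sites:
            end = j + len(primer)
            if j >= i + len(primer) and end - i <= max_len:
                products.append(template[i:end])
    return products


if __name__ == "__main__":
    toy_primer = "ACGTTG"  # made-up primer for illustration
    toy_template = "TT" + "ACGTTG" + "AAAA" + "CAACGT" + "GG"
    print(single_primer_amplicons(toy_template, toy_primer))
```

In this model, a template in which the primer sequence occurs only as a direct repeat yields no product, whereas each differently spaced pair of inverted copies yields a product of a different length.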

[0311] Each of the three products was subjected to DNA sequence analysis. The sequences of the three products are set forth in SEQ ID Nos. 13, 14, and 15, respectively. To be certain that the sequenced products were amplified from the neo-minichromosome, control amplifications were conducted using the same primers on DNA isolated from negative control cell lines (mouse Ltk⁻ cells) lacking minichromosomes and the formerly dicentric chromosome, and positive control cell lines [the mouse-hamster hybrid cell line GB43 generated by treating 19C5xHa4 cells (see FIG. 4) with BrdU followed by growth in G418-containing selective medium and retreatment with BrdU] containing the neo-minichromosome only. Only the positive control cell line yielded the three amplification products; no amplification product was detected in the negative control reaction. The results obtained in the positive control amplification also demonstrate that the neo-minichromosome DNA, and not the fragment of the formerly dicentric mouse chromosome, was amplified.

[0312] The sequences of the three amplification products were compared to those contained in the Genbank/EMBL database. SEQ ID Nos. 13 and 14 showed high (˜96%) homology to portions of DNA from intracisternal A-particles from mouse. SEQ ID No. 15 showed no significant homology with sequences available in the database. All three of these sequences may be used for generating gene targeting vectors as homologous DNAs to the neo-minichromosome.

[0313] C. Isolation and partial purification of minichromosomes

Mitotic chromosomes of EC3/7C5 cells were isolated as described by Hadlaczky et al. [(1981) Chromosoma 81:537-555], using a glycine-hexylene glycol buffer system [Hadlaczky et al. (1982) Chromosoma 86:643-659]. Chromosome suspensions were centrifuged at 1,200×g for 30 minutes. The supernatant containing minichromosomes was centrifuged at 5,000×g for 30 minutes and the pellet was resuspended in the appropriate buffer. Partially purified minichromosomes were stored in 50% glycerol at −20° C.

[0314] D. Stability of the MMCneo maintenance and neo expression

[0315] EC3/7C5 cells grown in non-selective medium for 284 days and then transferred to selective medium containing 400 μg/ml G418 showed a 96% plating efficiency (colony formation) compared to control cells cultured permanently in the presence of G418. Cytogenetic analysis indicated that the MMCneo is stably maintained at one copy per cell under selective and non-selective culture conditions. Only two metaphases with two MMCneo were found in 2,270 metaphases analyzed.

[0316] Southern hybridization analysis showed no detectable changes in DNA restriction patterns, and similar hybridization intensities were observed with a neo probe when DNA from cells grown under selective or non-selective culture conditions was compared.

[0317] Northern analysis of RNA transcripts from the neo gene isolated from cells grown under selective and non-selective conditions showed only minor and not significant differences. Expression of the neo gene persisted in EC3/7C5 cells maintained in F-12 medium free of G418 for 290 days under non-selective culture conditions. The long-term expression of the neo gene(s) from the minichromosome may be influenced by the nuclear location of the MMCneo. In situ hybridization experiments revealed a preferential peripheral location of the MMCneo in the interphase nucleus. In more than 60% of the 2,500 nuclei analyzed, the minichromosome was observed at the perimeter of the nucleus near the nuclear envelope.

EXAMPLE 3

[0318] Minichromosome Transfer and Production of the λ-neo-chromosome

[0319] A. Minichromosome transfer

[0320] The neo-minichromosome [referred to as MMCneo, FIG. 2C] has been used for gene transfer by fusion of minichromosome-containing cells [EC3/7C5 or EC3/7C6] with different mammalian cells, including hamster and human. Thirty-seven stable hybrid cell lines have been produced. All established hybrid cell lines proved to be true hybrids as evidenced by in situ hybridization using biotinylated human, and hamster genomic, or pMCPE1.51 mouse long interspersed repeated DNA probes for “chromosome painting”. The MMCneo has also been successfully transferred into mouse A9, L929 and pluripotent F9 teratocarcinoma cells by fusion of microcells derived from EC3/7C5 cells. Transfer was confirmed by PCR, Southern blotting and in situ hybridization with minichromosome-specific probes. The cytogenetic analysis confirmed that, as expected for microcell fusion, a few cells [1-5%] received [or retained] the MMCneo.

[0321] These results demonstrate that the MMCneo is tolerated by a wide range of cells. The prokaryotic genes and the extra dosage of the human and λ sequences carried on the minichromosome do not appear to be disadvantageous for tissue culture cells.

[0322] The MMCneo is the smallest chromosome of the EC3/7C5 genome and is estimated to be approximately 20-30 Mb, which is significantly smaller than the majority of the host cell (mouse) chromosomes. By virtue of the smaller size, minichromosomes can be partially purified from a suspension of isolated chromosomes by a simple differential centrifugation. In this way, minichromosome suspensions of 15-20% purity have been prepared. These enriched minichromosome preparations can be used to introduce, such as by microinjection or lipofection, the minichromosome into selected target cells. Target cells include therapeutic cells that can be used in methods of gene therapy, and also embryonic cells for the preparation of transgenic (non-human) animals.

[0323] The MMCneo is capable of autonomous replication, is stably maintained in cells, and permits persistent expression of the neo gene(s), even after long-term culturing under non-selective conditions. It is a non-integrative vector that appears to occupy a territory near the nuclear envelope. Its peripheral localization in the nucleus may have an important role in maintaining the functional integrity and stability of the MMCneo. Functional compartmentalization of the host nucleus may have an effect on the function of foreign sequences. In addition, MMCneo contains megabases of λ DNA sequences that should serve as a target site for homologous recombination and thus integration of desired gene(s) into the MMCneo. It can be transferred by cell and microcell fusion, microinjection, electroporation, lipid-mediated carrier systems or chromosome uptake. The neo-centromere of the MMCneo is capable of maintaining and supporting the normal segregation of a larger 150-200 Mb λneo-chromosome. This result demonstrates that the MMCneo chromosome should be useful for carrying large fragments of heterologous DNA.

[0324] B. Production of the λneo-chromosome

[0325] In the hybrid cell line KE1-2/4 made by fusion of EC3/7 and Chinese hamster ovary cells [FIG. 2], the separation of the neo-centromere from the dicentric chromosome was associated with a further amplification process. This amplification resulted in the formation of a stable chromosome of average size [i.e., the λneo-chromosome; see, Praznovszky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:11042-11046]. The λneo-chromosome carries a terminally located functional centromere and is composed of seven large amplicons containing multiple copies of λ, human, bacterial, and mouse DNA sequences [see FIG. 2]. The amplicons are separated by mouse major satellite DNA [Praznovszky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:11042-11046] which forms narrow bands of constitutive heterochromatin between the amplicons.

EXAMPLE 4

[0326] Formation of the “Sausage Chromosome” [SC]

[0327] The findings set forth in the above EXAMPLES demonstrate that the centromeric region of the mouse chromosome 7 has the capacity for large-scale amplification [other results indicate that this capacity is not unique to chromosome 7]. This conclusion is further supported by results from cotransfection experiments, in which a second dominant selectable marker gene and a non-selected marker gene were introduced into EC3/7C5 cells carrying the formerly dicentric chromosome 7 and the neo-minichromosome. The EC3/7C5 cell line was transformed with λ phage DNA, a hygromycin-resistance gene construct [pH132], and a β-galactosidase gene construct [pCH110]. Stable transformants were selected in the presence of a high concentration [400 μg/ml] of Hygromycin B, and analyzed by Southern hybridization. Established transformant cell lines showing multiple copies of integrated exogenous DNA were studied by in situ hybridization to localize the integration site(s), and by LacZ staining to detect β-galactosidase expression.

[0328] A. Materials and methods

[0329] 1. Construction of pH132

[0330] The pH132 plasmid carries the hygromycin B resistance gene and the anti-HIV-1 gag ribozyme [see, SEQ ID NO. 6 for DNA sequence that corresponds to the sequence of the ribozyme] under control of the β-actin promoter. This plasmid was constructed from pHyg plasmid [Sugden et al. (1985) Mol. Cell. Biol. 5:410-413; a gift from Dr. A. D. Riggs, Beckman Research Institute, Duarte; see, also, e.g., U.S. Pat. No. 4,997,764], and from pPC-RAG12 plasmid [see, Chang et al. (1990) Clin. Biotech. 2:23-31; provided by Dr. J. J. Rossi, Beckman Research Institute, Duarte; see, also U.S. Pat. Nos. 5,272,262, 5,149,796 and 5,144,019, which describe the anti-HIV gag ribozyme and construction of a mammalian expression vector containing the ribozyme insert linked to the β-actin promoter and SV40 late gene transcriptional termination and polyA signals]. Construction of pPC-RAG12 involved insertion of the ribozyme insert flanked by BamHI linkers into BamHI-digested PHβ-Apr-1gpt [see, Gunning et al. (1987) Proc. Natl. Acad. Sci. U.S.A. 84:4831-4835, see, also U.S. Pat. No. 5,144,019].

[0331] Plasmid pH132 was constructed as follows. First, pPC-RAG12 [described by Chang et al. (1990) Clin. Biotech. 2:23-31] was digested with BamHI to excise a fragment containing an anti-HIV ribozyme gene [referred to as ribozyme D by Chang et al. [(1990) Clin. Biotech. 2:23-31]; see also U.S. Pat. No. 5,144,019 to Rossi et al., particularly FIG. 4 of the patent] flanked by the human β-actin promoter at the 5′ end of the gene and the SV40 late transcriptional termination and polyadenylation signals at the 3′ end of the gene. As described by Chang et al. [(1990) Clin. Biotech. 2:23-31], ribozyme D is targeted for cleavage of the translational initiation region of the HIV gag gene. This fragment of pPC-RAG12 was subcloned into pBluescript-KS(+) [Stratagene, La Jolla, Calif.] to produce plasmid 132. Plasmid 132 was then digested with XhoI and EcoRI to yield a fragment containing the ribozyme D gene flanked by the β-actin promoter at the 5′ end and the SV40 termination and polyadenylation signals at the 3′ end of the gene. This fragment was ligated to the largest fragment generated by digestion of pHyg [Sugden et al. (1985) Mol. Cell. Biol. 5:410-413] with EcoRI and SalI to yield pH132. Thus, pH132 is a ˜9.3 kb plasmid containing the following elements: the β-actin promoter linked to an anti-HIV ribozyme gene followed by the SV40 termination and polyadenylation signals, the thymidine kinase gene promoter linked to the hygromycin-resistance gene followed by the thymidine kinase gene polyadenylation signal, and the E. coli ColE1 origin of replication and the ampicillin-resistance gene.
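The final ligation step joins a XhoI/EcoRI fragment to an EcoRI/SalI-digested vector. The text does not explain why the XhoI and SalI ends can be joined. The following minimal Python sketch is offered only as an illustration and is not part of the described procedure. It uses the commonly cited recognition sites of XhoI (C^TCGAG) and SalI (G^TCGAC) to show that both leave the same 5′ TCGA overhang. The dictionary and function names are hypothetical.

```python
# Illustrative sketch only. Recognition sites and top-strand cut offsets are the
# commonly cited ones for these enzymes; all names here are hypothetical helpers.
SITES = {
    "XhoI": ("CTCGAG", 1),   # C^TCGAG
    "SalI": ("GTCGAC", 1),   # G^TCGAC
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "BamHI": ("GGATCC", 1),  # G^GATCC
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def overhang(enzyme: str) -> str:
    """Return the 5' single-stranded overhang left by a palindromic cutter."""
    site, cut = SITES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Two sticky ends can anneal when one overhang is the reverse
    complement of the other."""
    return overhang(enzyme_a) == reverse_complement(overhang(enzyme_b))


if __name__ == "__main__":
    for a, b in [("XhoI", "SalI"), ("EcoRI", "EcoRI"), ("XhoI", "EcoRI")]:
        print(a, overhang(a), b, overhang(b), compatible(a, b))
```

Under these assumptions the EcoRI ends pair with each other and the XhoI end pairs with the SalI end, which fixes the orientation of the insert in the vector. A XhoI/SalI junction is not recut by either enzyme.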

[0332] The plasmid pHyg [see, e.g., U.S. Pat. Nos. 4,997,764, 4,686,186 and 5,162,215], which confers resistance to hygromycin B using transcriptional controls from the HSV-1 tk gene, was originally constructed from pKan2 [Yates et al. (1984) Proc. Natl. Acad. Sci. U.S.A. 81:3806-3810] and pLG89 [see, Gritz et al. (1983) Gene 25:179-188]. Briefly, pKan2 was digested with SmaI and BglII to remove the sequences derived from transposon Tn5. The hygromycin-resistance hph gene was inserted into the digested pKan2 using blunt-end ligation at the SmaI site and “sticky-end” ligation [using 1 Weiss unit of T4 DNA ligase (BRL) in 20 microliter volume] at the BglII site. The SmaI and BglII sites of pKan2 were lost during ligation.

[0333] The resulting plasmid pH132, produced from introduction of the anti-HIV ribozyme construct with promoter and polyA site into pHyg, includes the anti-HIV ribozyme under control of the β-actin promoter as well as the hygromycin-resistance gene under control of the TK promoter.

[0334] 2. Chromosome Banding

[0335] Trypsin G-banding of chromosomes was performed as described in EXAMPLE 1.

[0336] 3. Cell cultures

[0337] TF1004G19 and TF1004G-19C5 mouse cells and the 19C5xHa4 hybrid, described below, and its sublines were cultured in F-12 medium containing 400 μg/ml Hygromycin B [Calbiochem].

[0338] B. Cotransfection of EC3/7C5 to produce TF1004G19

[0339] Cotransfection of EC3/7C5 cells with plasmids [pH132, pCH110 available from Pharmacia, see, also Hall et al. (1983) J. Mol. Appl. Gen. 2:101-109] and with λ DNA [λcl 875 Sam 7 (New England Biolabs)] was conducted using the calcium phosphate DNA precipitation method [see, e.g., Chen et al. (1987) Mol. Cell. Biol. 7:2745-2752], using 2-5 μg plasmid DNA and 20 μg λ phage DNA per 5×10⁶ recipient cells.

[0340] C. Cell lines containing the sausage chromosome

[0341] Analysis of one of the transformants, designated TF1004G19, revealed that it has a high copy number of integrated pH132 and pCH110 sequences, and a high level of β-galactosidase expression. G-banding and in situ hybridization with a human probe [CM8; see, e.g., U.S. application Ser. No. 08/375,271] revealed unexpectedly that integration had occurred in the formerly dicentric chromosome 7 of the EC3/7C5 cell line. Furthermore, this chromosome carried a newly formed heterochromatic chromosome arm. The size of this heterochromatic arm varied between ˜150 and ˜800 Mb in individual metaphases.

[0342] By single cell cloning from the TF1004G19 cell line, a subclone TF1004G-19C5 [FIG. 2D], which carries a stable chromosome 7 with a ˜100-150 Mb heterochromatic arm [the sausage chromosome] was obtained. This cell line has been deposited in the ECACC under Accession No. 96040926. This chromosome arm is composed of four to five satellite segments rich in satellite DNA, and evenly spaced integrated heterologous “foreign” DNA sequences. At the end of the compact heterochromatic arm of the sausage chromosome, a less condensed euchromatic terminal segment is regularly observed. This subclone was used for further analyses.

[0343] D. Demonstration that the sausage chromosome is derived from the formerly dicentric chromosome

[0344] In situ hybridization with λ phage and pH132 DNA on the TF1004G-19C5 cell line showed positive hybridization only on the minichromosome and on the heterochromatic arm of the “sausage” chromosome [FIG. 2D]. It appears that the “sausage” chromosome [herein also referred to as the SC] developed from the formerly dicentric chromosome (FD) of the EC3/7C5 cell line.

[0345] To establish this, the integration sites of pCH110 and pH132 plasmids were determined. This was accomplished by in situ hybridization on these cells with biotin-labeled subfragments of the hygromycin-resistance gene and the β-galactosidase gene. Both experiments resulted in narrow hybridizing bands on the heterochromatic arm of the sausage chromosome. The same hybridization pattern was detected on the sausage chromosome using a mixture of biotin-labeled λ probe and pH132 plasmid, proving the cointegration of λ phages, pH132 and pCH110 plasmids.

[0346] To examine this further, the cells were cultured in the presence of the DNA-binding dye Hoechst 33258. Culturing of mouse cells in the presence of this dye results in under-condensation of the pericentric heterochromatin of metaphase chromosomes, thereby permitting better observation of the hybridization pattern. Using this technique, the heterochromatic arm of the sausage chromosome of TF1004G-19C5 cells showed regular under-condensation revealing the details of the structure of the “sausage” chromosome by in situ hybridization. Results of in situ hybridization on Hoechst-treated TF1004G-19C5 cells with biotin-labeled subfragments of hygromycin-resistance and β-galactosidase genes show that these genes are localized only in the heterochromatic arm of the sausage chromosome. In addition, an equal banding hybridization pattern was observed. This pattern of repeating units [amplicons] clearly indicates that the sausage chromosome was formed by an amplification process and that the λ phage, pH132 and pCH110 plasmid DNA sequences border the amplicons.

[0347] In another series of experiments using fluorescence in situ hybridization [FISH] carried out with mouse major satellite DNA, the main component of the mouse pericentric heterochromatin, the results confirmed that the amplicons of the sausage chromosome are primarily composed of satellite DNA.

[0348] E. The sausage chromosome has one centromere

[0349] To determine whether mouse centromeric sequences had participated in the amplification process forming the “sausage” chromosome and whether or not the amplicons carry inactive centromeres, in situ hybridization was carried out with mouse minor satellite DNA. Mouse minor satellite DNA is localized specifically near the centromeres of all mouse chromosomes. Positive hybridization was detected in all mouse centromeres including the sausage chromosome, which, however, only showed a positive signal at the beginning of the heterochromatic arm.

[0350] Indirect immunofluorescence with a human anti-centromere antibody [LU851], which recognizes only functional centromeres [see, e.g., Hadlaczky et al. (1989) Chromosoma 97:282-288], proved that the sausage chromosome has only one active centromere. The centromere comes from the formerly dicentric part of the chromosome and co-localizes with the in situ hybridization signal of the mouse minor satellite DNA probe.

[0351] F. The selected and non-selected heterologous DNA in the heterochromatin of the sausage chromosome is expressed

[0352] 1. High Levels of the Heterologous Genes are Expressed

[0353] The TF1004G-19C5 cell line thus carries multiple copies of hygromycin-resistance and β-galactosidase genes localized only in the heterochromatic arm of the sausage chromosome. The TF1004G-19C5 cells can grow very well in the presence of 200 μg/ml or even 400 μg/ml hygromycin B. [The level of expression was determined by Northern hybridization with a subfragment of the hygromycin-resistance gene and a single copy gene.]

[0354] The expression of the non-selected β-galactosidase gene in the TF1004G-19C5 transformant was detected with LacZ staining of the cells. By this method one hundred percent of the cells stained dark blue, showing that there is a high level of β-galactosidase expression in all of the TF1004G-19C5 cells.

[0355] 2. The Heterologous Genes that are Expressed are in the Heterochromatin of the Sausage Chromosome

[0356] To demonstrate that the genes localized in the constitutive heterochromatin of the sausage chromosome provide the hygromycin resistance and the LacZ staining capability of TF1004G-19C5 transformants [i.e., β-gal expression], PEG-induced cell fusion between TF1004G-19C5 mouse cells and Chinese hamster ovary cells was performed. The hybrids were selected and maintained in HAT medium containing G418 [400 μg/ml] and hygromycin [200 μg/ml]. Two hybrid clones designated 19C5xHa3 and 19C5xHa4, which have been deposited in the ECACC under Accession No. 96040927, were selected. Both carry the sausage chromosome and the minichromosome.

[0357] Twenty-seven single cell derived colonies of the 19C5xHa4 hybrid were maintained and analyzed as individual subclones. In situ hybridization with hamster and mouse chromosome painting probes and hamster chromosome 2-specific probes verified that the 19C5xHa4 clone contains the complete Chinese hamster genome and a partial mouse genome. All 19C5xHa4 subclones retained the hamster genome, but different subclones showed different numbers of mouse chromosomes indicating the preferential elimination of mouse chromosomes.

[0358] To promote further elimination of mouse chromosomes, hybrid cells were repeatedly treated with BrdU. The BrdU treatments, which destabilize the genome, result in significant loss of mouse chromosomes. The BrdU-treated 19C5xHa4 hybrid cells were divided into three groups. One group of the hybrid cells (GH) was maintained in the presence of hygromycin (200 μg/ml) and G418 (400 μg/ml), and the other two groups of the cells were cultured under G418 (G) or hygromycin (H) selection conditions to promote the elimination of the sausage chromosome or minichromosome.

[0359] One month later, single cell derived subclones were established from these three subcultures of the 19C5xHa4 hybrid line. The subclones were monitored by in situ hybridization with biotin-labeled λ phage and hamster chromosome painting probes. Four individual clones [G2B5, G3C5, G4D6, G2B4] selected in the presence of G418 that had lost the sausage chromosome but retained the minichromosome were found. Under hygromycin selection only one subclone [H1D3] lost the minichromosome. In this clone the megachromosome [see Example 5] was present.

[0360] Since hygromycin-resistance and β-galactosidase genes were thought to be expressed from the sausage chromosome, the expression of these genes was analyzed in the four subclones that had lost the sausage chromosome. In the presence of 200 μg/ml hygromycin, one hundred percent of the cells of the four individual subclones died. In order to detect β-galactosidase expression, hybrid subclones were analyzed by LacZ staining. One hundred percent of the cells of the four subclones that lost the sausage chromosome also lost the LacZ staining capability. All of the other hybrid subclones that had not lost the sausage chromosome under the non-selective culture conditions showed positive LacZ staining.

[0361] These findings demonstrate that the expression of hygromycin-resistance and β-galactosidase genes is linked to the presence of the sausage chromosome. Results of in situ hybridizations show that the heterologous DNA is expressed from the constitutive heterochromatin of the sausage chromosome.

[0362] In situ hybridization studies of three other hybrid subclones [G2C6, G2D1, and G4D5] did not detect the presence of the sausage chromosome. By the LacZ staining method, some stained cells were detected in these hybrid lines, and when these subclones were transferred to hygromycin selection some colonies survived. Cytological analysis and in situ hybridization of these hygromycin-resistant colonies revealed the presence of the sausage chromosome, suggesting that only the cells of G2C6, G2D1 and G4D5 hybrids that had not lost the sausage chromosome were able to preserve the hygromycin resistance and β-galactosidase expression. These results confirmed that the expression of these genes is linked to the presence of the sausage chromosome. The level of β-galactosidase expression was determined by the immunoblot technique using a monoclonal antibody.

[0363] Hygromycin resistance and β-galactosidase expression of the cells which contained the sausage chromosome were provided by the genes localized in the mouse pericentric heterochromatin. This was demonstrated by performing Southern DNA hybridizations on the hybrid cells that lack the sausage chromosome using PCR-amplified subfragments of hygromycin-resistance and β-galactosidase genes as probes. None of the subclones showed hybridization with these probes; however, all of the analyzed clones contained the minichromosome. Other hybrid clones that contain the sausage chromosome showed intense hybridization with these DNA probes. These results lead to the conclusion that hygromycin resistance and β-galactosidase expression of the cells that contain the sausage chromosome were provided by the genes localized in the mouse pericentric heterochromatin.

EXAMPLE 5

[0364] The Gigachromosome

[0365] As described in Example 4, the sausage chromosome was transferred into Chinese hamster cells by cell fusion. Using Hygromycin B/HAT and G418 selection, two hybrid clones 19C5xHa3 and 19C5xHa4 were produced that carry the sausage chromosome. In situ hybridization, using hamster and mouse chromosome-painting probes and a hamster chromosome 2-specific probe, verified that clone 19C5xHa4 contains a complete Chinese hamster genome as well as partial mouse genomes. Twenty-seven separate colonies of 19C5xHa4 cells were maintained and analyzed as individual subclones. Twenty-six out of 27 subclones contained a morphologically unchanged sausage chromosome. In one subclone of the 19C5xHa3 cell line, 19C5xHa47 [see FIG. 2E], the heterochromatic arm of the sausage chromosome became unstable and showed continuous intrachromosomal growth. In extreme cases, the amplified chromosome arm exceeded 1000 Mb in size (gigachromosome).

EXAMPLE 6

[0366] The Stable Megachromosome

A. Generation of cell lines containing the megachromosome

All 19C5xHa4 subclones retained a complete hamster genome, but different subclones showed different numbers of mouse chromosomes, indicating the preferential elimination of mouse chromosomes. As described in Example 4, to promote further elimination of mouse chromosomes, hybrid cells were treated with BrdU, cultured under G418 (G) or hygromycin (H) selection conditions followed by repeated treatment with 10⁻⁴ M BrdU for 16 hours and single cell subclones were established. The BrdU treatments appeared to destabilize the genome, resulting in a change in the sausage chromosome as well. A gradual increase in a cell population in which a further amplification had occurred was observed. In addition to the ˜100-150 Mb heterochromatic arm of the sausage chromosome, an extra centromere and a ˜150-250 Mb heterochromatic chromosome arm were formed, which differed from those of mouse chromosome 7. By the acquisition of another euchromatic terminal segment, a new submetacentric chromosome (megachromosome) was formed. Seventy-nine individual subclones were established from these BrdU-treated cultures by single-cell cloning: 42 subclones carried the intact megachromosome, 5 subclones carried the sausage chromosome, and in 32 subclones fragments or translocated segments of the megachromosome were observed. Twenty-six subclones that carried the megachromosome were cultured under non-selective conditions over a two-month period. In 19 out of 26 subclones, the megachromosome was retained. Those subclones which lost the megachromosomes all became sensitive to Hygromycin B and had no β-galactosidase expression, indicating that both markers were linked to the megachromosome.
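The subclone counts reported above can be expressed as proportions. The short Python sketch below is only an illustrative tabulation of the numbers stated in the text. The variable and function names are hypothetical, and no statistical inference is implied.

```python
# Illustrative tabulation of the counts stated in the text; names are hypothetical.
OUTCOMES = {
    "intact megachromosome": 42,
    "sausage chromosome": 5,
    "fragments or translocated segments": 32,
}


def fractions(counts: dict) -> dict:
    """Convert raw subclone counts into fractions of the total."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


TOTAL_SUBCLONES = sum(OUTCOMES.values())   # the text reports 79 subclones
RETAINED, CULTURED = 19, 26                # non-selective culture, two months
RETENTION = RETAINED / CULTURED

if __name__ == "__main__":
    print("total subclones:", TOTAL_SUBCLONES)
    for name, frac in fractions(OUTCOMES).items():
        print(f"{name}: {frac:.1%}")
    print(f"megachromosome retained without selection: {RETENTION:.1%}")
```

The three categories sum to the 79 subclones stated in the text. Roughly half carried the intact megachromosome, and about three quarters of the tested subclones retained it without selection.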

[0367] Two sublines (G3D5 and H1D3), which were chosen for further experiments, showed no changes in the morphology of the megachromosome during more than 100 generations under selective conditions. The G3D5 cells had been obtained by growth of 19C5xHa4 cells in G418-containing medium followed by repeated BrdU treatment, whereas H1D3 cells had been obtained by culturing 19C5xHa4 cells in hygromycin-containing medium followed by repeated BrdU treatment.

[0368] B. Structure of the megachromosome

[0369] The following results demonstrate that, apart from the euchromatic terminal segments, the integrated foreign DNA (and as in the exemplified embodiments, rDNA sequence), the whole megachromosome is constitutive heterochromatin, containing a tandem array of at least 40 [˜7.5 Mb] blocks of mouse major satellite DNA [see FIGS. 2 and 3]. Four satellite DNA blocks are organized into a giant palindrome [amplicon] carrying integrated exogenous DNA sequences at each end. The long and short arms of the submetacentric megachromosome contain 6 and 4 amplicons, respectively. It is of course understood that the specific organization and size of each component can vary among species, and also the chromosome in which the amplification event initiates.
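The block and amplicon counts given above imply an approximate size for the satellite portion of the megachromosome. The Python sketch below is a rough consistency check only. It uses the approximate figures from the text, ignores the euchromatic terminal segments and the integrated heterologous DNA, and uses hypothetical constant names.

```python
# Rough consistency check using the approximate figures stated in the text.
# Constant names are hypothetical; sizes are estimates, not measurements.
BLOCK_MB = 7.5            # approximate size of one major satellite block
BLOCKS_PER_AMPLICON = 4   # four blocks form one palindromic amplicon
AMPLICONS_LONG_ARM = 6
AMPLICONS_SHORT_ARM = 4

AMPLICON_MB = BLOCK_MB * BLOCKS_PER_AMPLICON
TOTAL_AMPLICONS = AMPLICONS_LONG_ARM + AMPLICONS_SHORT_ARM
TOTAL_BLOCKS = TOTAL_AMPLICONS * BLOCKS_PER_AMPLICON
SATELLITE_MB = TOTAL_BLOCKS * BLOCK_MB

if __name__ == "__main__":
    print(f"amplicon size: ~{AMPLICON_MB} Mb")
    print(f"satellite blocks: {TOTAL_BLOCKS}")
    print(f"satellite DNA in both arms: ~{SATELLITE_MB} Mb")
```

With these figures, ten amplicons give 40 blocks and about 300 Mb of satellite DNA. That falls within the ˜250-400 Mb obtained by adding the two heterochromatic arm sizes given in Example 6.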

[0370] 1. The Megachromosome is Composed Primarily of Heterochromatin

[0371] Except for the terminal regions and the integrated foreign DNA, the megachromosome is composed primarily of heterochromatin. This was demonstrated by C-banding of the megachromosome, which resulted in positive staining characteristic of constitutive heterochromatin. Apart from the terminal regions and the integrated foreign DNA, the whole megachromosome appears to be heterochromatic. Mouse major satellite DNA is the main component of the pericentric, constitutive heterochromatin of mouse chromosomes and represents ˜10% of the total DNA [Waring et al. (1966) Science 154:791-794]. Using a mouse major satellite DNA probe for in situ hybridization, strong hybridization was observed throughout the megachromosome, except for its terminal regions. The hybridization showed a segmented pattern: four large blocks appeared on the short arm and usually 4-7 blocks were seen on the long arm. By comparing these segments with the pericentric regions of normal mouse chromosomes that carry ˜15 Mb of major satellite DNA, the size of the blocks of major satellite DNA on the megachromosome was estimated to be ˜30 Mb.

[0372] Using a mouse probe specific to euchromatin [pMCPE1.51; a mouse long interspersed repeated DNA probe], positive hybridization was detected only on the terminal segments of the megachromosome of the H1D3 hybrid subline. In the G3D5 hybrids, hybridization with a hamster-specific probe revealed that several megachromosomes contained terminal segments of hamster origin on the long arm. This observation indicated that the acquisition of the terminal segments on these chromosomes happened in the hybrid cells, and that the long arm of the megachromosome was the recently formed arm. When a mouse minor satellite probe specific to the centromeres of mouse chromosomes was used [Wong et al. (1988) Nucl. Acids Res. 16:11645-11661], a strong hybridization signal was detected only at the primary constriction of the megachromosome, which colocalized with the positive immunofluorescence signal produced with human anti-centromere serum [LU851].

[0373] In situ hybridization experiments with pH132, pCH110, and λ DNA probes revealed that all heterologous DNA was located in the gaps between the mouse major satellite DNA segments. Each segment of mouse major satellite DNA was bordered by a narrow band of integrated heterologous DNA, except at the second segment of the long arm where a double band of heterologous DNA existed, indicating that the major satellite DNA segment was missing or considerably reduced in size here. This chromosome region served as a useful cytological marker in identifying the long arm of the megachromosome. At a frequency of 10⁻⁴, “restoration” of these missing satellite DNA blocks was observed in one chromatid, when the formation of a whole segment on one chromatid occurred.

[0374] After Hoechst 33258 treatment (50 μg/ml for 16 hours), the megachromosome showed undercondensation throughout its length except for the terminal segments. This made it possible to study the architecture of the megachromosome at higher resolution. In situ hybridization with the mouse major satellite probe on undercondensed megachromosomes demonstrated that the ˜30 Mb major satellite segments were composed of four blocks of ˜7.5 Mb separated from each other by a narrow band of non-hybridizing sequences [FIG. 3]. Similar segmentation can be observed in the large block of pericentric heterochromatin in metacentric mouse chromosomes from the LMTK⁻ and A9 cell lines.

[0375] 2. The Megachromosome is Composed of Segments Containing Two Tandem ˜7.5 Mb Blocks followed by Two Inverted Blocks

[0376] Because of the asymmetry in thymidine content between the two strands of the DNA of the mouse major satellite, when mouse cells are grown in the presence of BrdU for a single S phase, the constitutive heterochromatin shows lateral asymmetry after FPG staining. Also, in the 19C5xHa4 hybrids, the thymidine-kinase [Tk] deficiency of the mouse fibroblast cells was complemented by the hamster Tk gene, permitting BrdU incorporation experiments.

[0377] A striking structural regularity in the megachromosome was detected using the FPG technique. In both chromatids, alternating dark and light staining that produced a checkered appearance of the megachromosome was observed. A similar picture was obtained by labelling with fluorescein-conjugated anti-BrdU antibody. Comparing these pictures to the segmented appearance of the megachromosome showed that one dark and one light FPG band corresponded to one ˜30 Mb segment of the megachromosome. These results suggest that the two halves of the ˜30 Mb segment have an inverted orientation. This was verified by combining in situ hybridization and immunolabelling of the incorporated BrdU with fluorescein-conjugated anti-BrdU antibody on the same chromosome. Since the ˜30 Mb segments [or amplicons] of the megachromosome are composed of four blocks of mouse major satellite DNA, it can be concluded that two tandem ˜7.5 Mb blocks are followed by two inverted blocks within one segment.
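The relationship between block orientation and the alternating FPG bands can be illustrated with a toy model. The Python sketch below is an assumption-laden illustration and not a description of the staining chemistry. It represents each ˜7.5 Mb block by an arbitrary orientation symbol ("+" or "-") and merges adjacent blocks of the same orientation into one band. Which orientation stains dark on a given chromatid is not modeled, and all names are hypothetical.

```python
from itertools import groupby

# Toy model only: "+" and "-" are arbitrary labels for the two block orientations.
BLOCK_MB = 7.5
SEGMENT = ["+", "+", "-", "-"]  # two tandem blocks followed by two inverted blocks


def bands(n_segments: int) -> list:
    """Merge adjacent blocks of equal orientation into (orientation, size_Mb) bands."""
    blocks = SEGMENT * n_segments
    return [(orient, sum(1 for _ in run) * BLOCK_MB)
            for orient, run in groupby(blocks)]


if __name__ == "__main__":
    for orient, size in bands(3):
        print(orient, f"~{size} Mb")
```

In this model each ˜30 Mb segment yields exactly two bands of about 15 Mb with opposite orientation, and consecutive segments produce a strictly alternating pattern. This is consistent with one dark and one light band per segment.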

[0378] Large-scale mapping of megachromosome DNA by pulsed-field electrophoresis and Southern hybridization with “foreign” DNA probes revealed a simple pattern of restriction fragments. Using endonucleases with none, or only a single cleavage site in the integrated foreign DNA sequences, followed by hybridization with a hyg probe, 1-4 predominant fragments were detected. Since the megachromosome contains 10-12 amplicons with an estimated 3-8 copies of hyg sequences per amplicon (30-90 copies per megachromosome), the small number of hybridizing fragments indicates the homogeneity of DNA in the amplified segments.

[0379] 3. Scanning Electron Microscopy of the Megachromosome Confirmed the Above Findings

[0380] The homogeneous architecture of the heterochromatic arms of the megachromosome was confirmed by high resolution scanning electron microscopy. Extended arms of megachromosomes, and the pericentric heterochromatic region of mouse chromosomes, treated with Hoechst 33258, showed similar structure. The constitutive heterochromatic regions appeared more compact than the euchromatic segments. Apart from the terminal regions, both arms of the megachromosome were completely extended, and showed faint grooves, which should correspond to the border of the satellite DNA blocks in the non-amplified chromosomes and in the megachromosome. Without Hoechst treatment, the grooves seemed to correspond to the amplicon borders on the megachromosome arms. In addition, centromeres showed a more compact, finely fibrous appearance than the surrounding heterochromatin.

[0381] 4. The Megachromosome of 1B3 Cells Contains rRNA Gene Sequence

[0382] The sequence of the megachromosome in the region of the sites of integration of the heterologous DNA was investigated by isolating these regions using cloning methods and by sequence analysis of the resulting clones. The results of this analysis revealed that the heterologous DNA was located near mouse ribosomal RNA gene (i.e., rDNA) sequences contained in the megachromosome.

[0383] a. Cloning of regions of the megachromosomes in which heterologous DNA had integrated

[0384] Megachromosomes were isolated from 1B3 cells (which were generated by repeated BrdU treatment and single cell cloning of H1xHE41 cells (see FIG. 4) and which contain a truncated megachromosome) using fluorescence-activated cell sorting methods as described herein (see Example 10). Following separation of the SATACs (megachromosomes) from the endogenous chromosomes, the isolated megachromosomes were stored in GH buffer (100 mM glycine, 1% hexylene glycol, pH 8.4-8.6 adjusted with saturated calcium hydroxide solution; see Example 10) and centrifuged into an agarose bed in 0.5 M EDTA.

[0385] Large-scale mapping of the megachromosome around the area of the site of integration of the heterologous DNA revealed that it is enriched in sequence containing rare-cutting enzyme sites, such as the recognition site for NotI. Additionally, mouse major satellite DNA (which makes up the majority of the megachromosome) does not contain NotI recognition sites. Therefore, to facilitate isolation of regions of the megachromosome associated with the site of integration of the heterologous DNA, the isolated megachromosomes were cleaved with NotI, a rare cutting restriction endonuclease with an 8-bp GC recognition site. Fragments of the megachromosome were inserted into plasmid pWE15 (Stratagene, La Jolla, Calif.) as follows. Half of a 100-μl low melting point agarose block (mega-plug) containing the isolated SATACs was digested with NotI overnight at 37° C. Plasmid pWE15 was similarly digested with NotI overnight. The mega-plug was then melted and mixed with the digested plasmid, ligation buffer and T4 ligase. Ligation was conducted at 16° C. overnight. Bacterial DH5α cells were transformed with the ligation product and transformed cells were plated onto LB/Amp plates. Fifteen to twenty colonies were grown on each plate for a total of 189 colonies. Plasmid DNA was isolated from colonies that survived growth on LB/Amp medium and was analyzed by Southern blot hybridization for the presence of DNA that hybridized to a pUC19 probe. This screening methodology assured that all clones, even clones lacking an insert but yet containing the pWE15 plasmid, would be detected. Any clones containing insert DNA would be expected to contain non-satellite, GC-rich megachromosome DNA sequences located at the site of integration of the heterologous DNA. All colonies were positive for hybridizing DNA.
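The rationale for using NotI can be illustrated in silico. The Python sketch below assumes the commonly cited NotI recognition sequence GCGGCCGC (cut GC^GGCCGC). It locates sites in a DNA string and reports the fragment lengths of a linear digest. The function names and the toy sequence are hypothetical, and the expected site spacing assumes equal, independent base frequencies, which real genomes do not satisfy.

```python
import re

# Commonly cited NotI site, cut as GC^GGCCGC on the top strand.
# Function names and the toy sequence below are hypothetical.
NOTI_SITE = "GCGGCCGC"
NOTI_CUT_OFFSET = 2

# Expected spacing of an 8-bp site in random sequence with equal base frequencies.
EXPECTED_SPACING_BP = 4 ** len(NOTI_SITE)


def find_sites(seq: str, site: str = NOTI_SITE) -> list:
    """Return 0-based start positions of every (possibly overlapping) site."""
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]


def digest_lengths(seq: str, site: str = NOTI_SITE,
                   cut_offset: int = NOTI_CUT_OFFSET) -> list:
    """Return top-strand fragment lengths after complete digestion of a linear molecule."""
    cuts = [pos + cut_offset for pos in find_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "AT" * 5 + "GCGGCCGC" + "TA" * 4 + "gcggccgc" + "AAAA"
    print(find_sites(toy))
    print(digest_lengths(toy))
    print(EXPECTED_SPACING_BP)
```

Because AT-rich satellite repeats lack this GC-only site, a NotI digest would be expected to leave the satellite blocks largely uncut while releasing fragments from GC-rich regions, such as those near the integration sites.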

[0386] Liquid cultures of all 189 transformants were used to generate cosmid minipreps for analysis of restriction sites within the insert DNA. Six of the original 189 cosmid clones contained an insert. These clones were designated as follows: 28 (˜9-kb insert), 30 (˜9-kb insert), 60 (˜4-kb insert), 113 (˜9-kb insert), 157 (˜9-kb insert) and 161 (˜9-kb insert). Restriction enzyme analysis indicated that three of the clones (113, 157 and 161) contained the same insert.

[0387] b. In situ hybridization experiments using isolated segments ofthe megachromosome as probes

[0388] Insert DNA from clones 30, 113, 157 and 161 was purified, labeled and used as probes in in situ hybridization studies of several cell lines. Counterstaining of the cells with propidium iodide facilitated identification of the cytological sites of the hybridization signals. The locations of the signals detected within the cells are summarized in the following table:

CELL TYPE | PROBE | LOCATION OF SIGNAL
Human Lymphocyte (male) | No. 161 | 4-5 pairs of acrocentric chromosomes at centromeric regions.
Mouse Spleen | No. 161 | Acrocentric ends of 4 pairs of chromosomes.
EC3/7C5 Cells | No. 161 | Minichromosome and the end of the formerly dicentric chromosome. Pericentric heterochromatin of one of the metacentric mouse chromosomes. Centromeric region of some of the other mouse chromosomes.
K20 Chinese Hamster Cells | No. 30 | Ends of at least 6 pairs of chromosomes. An interstitial signal on a short chromosome.
HB31 Cells (mouse-hamster hybrid cells derived from H1D3 cells by repeated BrdU treatment and single cell cloning which carries the megachromosome) | No. 30 | Acrocentric ends of at least 12 pairs of chromosomes. Centromeres of certain chromosomes and the megachromosome. Borders of the amplicons of the megachromosome.
Mouse Spleen Cells | No. 30 | Similar to signal observed for probe no. 161. Centromeres of 5 pairs of chromosomes. Weak cross-hybridization to pericentric heterochromatin.
HB31 Cells | No. 113 | Similar to signal observed for probe no. 30.
Mouse Spleen Cells | No. 113 | Centromeric region of 5 pairs of chromosomes.
K20 Cells | No. 113 | At least 6 pairs of chromosomes. Weak signal at some telomeres and several interspersed signals.
Human Lymphocyte Cells (male) | No. 157 | Similar to signal observed for probe no. 161.

[0389] c. Southern blot hybridization using isolated segments of the megachromosome as probes

[0390] DNA was isolated from mouse spleen tissue, mouse LMTK⁻ cells, K20 Chinese hamster ovary cells, EJ30 human fibroblast cells and H1D3 cells. The isolated DNA, along with lambda phage DNA, was subjected to Southern blot hybridization using inserts isolated from megachromosome clone nos. 30, 113, 157 and 161 as probes. Plasmid pWE15 was used as a negative control probe. Each of the four megachromosome clone inserts hybridized in a multi-copy manner (as demonstrated by the intensity of hybridization and the number of hybridizing bands) to all of the DNA samples, except the lambda phage DNA. Plasmid pWE15 hybridized to lambda DNA only.

[0391] d. Sequence analysis of megachromosome clone no. 161

[0392] Megachromosome clone no. 161 appeared to show the strongest hybridization in the in situ and Southern hybridization experiments and was chosen for analysis of the insert sequence. The sequence analysis was approached by first subcloning the insert of cosmid clone no. 161 to obtain five subclones as follows.

[0393] To obtain the end fragments of the insert of clone no. 161, the clone was digested with NotI and BamHI and ligated with NotI/BamHI-digested pBluescript KS (Stratagene, La Jolla, Calif.). Two fragments of the insert of clone no. 161 were obtained: a 0.2-kb and a 0.7-kb insert fragment. To subclone the internal fragments of the insert of clone no. 161, the same digest was ligated with BamHI-digested pUC19. Three fragments of the insert of clone no. 161 were obtained: a 0.6-kb, a 1.8-kb and a 4.8-kb insert fragment.

[0394] The ends of all the subcloned insert fragments were first sequenced manually. However, due to their extremely high GC content, autoradiographs were difficult to interpret and sequencing was repeated using an ABI sequencer and the dye-terminator cycle protocol. A comparison of the sequence data to sequences in the GENBANK database revealed that the insert of clone no. 161 corresponds to an internal section of the mouse ribosomal RNA gene (rDNA) repeat unit between positions 7551-15670 as set forth in GENBANK accession no. X82564, which is provided as SEQ ID NO. 16 herein. The sequence data obtained for the insert of clone no. 161 is set forth in SEQ ID NOS. 18-24. Specifically, the individual subclones corresponded to the following positions in GENBANK accession no. X82564 (i.e., SEQ ID NO. 16) and in SEQ ID NOs. 18-24:

Subclone | Start in X82564 | End in X82564 | Site | SEQ ID No.
161k1 | 7579 | 7755 | NotI, BamHI | 18
161m5 | 7756 | 8494 | BamHI | 19
161m7 | 8495 | 10231 | BamHI | 20 (shows only sequence corresponding to nt. 8495-8950), 21 (shows only sequence corresponding to nt. 9851-10231)
161m12 | 10232 | 15000 | BamHI | 22 (shows only sequence corresponding to nt. 10232-10600), 23 (shows only sequence corresponding to nt. 14267-15000)
161k2 | 15001 | 15676 | NotI, BamHI | 24
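As an illustrative cross-check only (it is not part of the analysis described herein), the subclone positions listed above can be tabulated to confirm that the five subclones tile the region without gaps and to compare their lengths with the approximate fragment sizes obtained on subcloning. The following Python sketch assumes that start and end positions are both inclusive; the variable and function names are hypothetical.

```python
# Start/end positions in GENBANK accession no. X82564, copied from the text above.
# Assumption: both positions are inclusive.
SUBCLONES = [
    ("161k1", 7579, 7755),
    ("161m5", 7756, 8494),
    ("161m7", 8495, 10231),
    ("161m12", 10232, 15000),
    ("161k2", 15001, 15676),
]


def subclone_lengths(subclones):
    """Return {name: length in bp}, counting both end positions."""
    return {name: end - start + 1 for name, start, end in subclones}


def is_contiguous(subclones):
    """True if every subclone starts one position after the previous one ends."""
    return all(
        nxt[1] == prev[2] + 1 for prev, nxt in zip(subclones, subclones[1:])
    )


lengths = subclone_lengths(SUBCLONES)
for name, bp in lengths.items():
    print(f"{name}: {bp} bp (~{bp / 1000:.1f} kb)")
print("contiguous:", is_contiguous(SUBCLONES))
print("total covered:", sum(lengths.values()), "bp")
```

Under the inclusive-position assumption, the listed subclones cover a single uninterrupted stretch of the rDNA repeat unit.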

[0395] The sequence set forth in SEQ ID NOs. 18-24 diverges in some positions from the sequence presented in positions 7551-15670 of GENBANK accession no. X82564. Such divergence may be attributable to random mutations between repeat units of rDNA. The results of the sequence analysis of clone no. 161, which reveal that it corresponds to rDNA, correlate with the appearance of the in situ hybridization signal it generated in human lymphocytes and mouse spleen cells. The hybridization signal was clearly observed on acrocentric chromosomes in these cells, and such types of chromosomes are known to include rDNA adjacent to the pericentric satellite DNA on the short arm of the chromosome. Furthermore, rRNA genes are highly conserved in mammals as supported by the cross-species hybridization of clone no. 161 to human chromosomal DNA.

[0396] To isolate amplification-replication control regions such as those found in rDNA, it may be possible to subject DNA isolated from megachromosome-containing cells, such as H1D3 cells, to nucleic acid amplification using, e.g., the polymerase chain reaction (PCR) with the following primers:

[0397] amplification control element forward primer (1-30)

[0398] 5′-GAGGAATTCCCCATCCCTAATCCAGATTGGTG-3′ (SEQ ID NO. 25)

[0399] amplification control element reverse primer (2142-2111)

[0400] 5′-AAACTGCAGGCCGAGCCACCTCTCTTCTGTGTTTG-3′ (SEQ ID NO. 26)

[0401] origin of replication region forward primer (2116-2141)

[0402] 5′-AGGAATTCACAGAAGAGAGGTGGCTCGGCCTGC-3′ (SEQ ID NO. 27)

[0403] origin of replication region reverse primer (5546-5521)

[0404] 5′-AGCCTGCAGGAAGTCATACCTGGGGAGGTGGCCC-3′ (SEQ ID NO. 28)
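The four primer sequences above can be inspected with a short script. The following Python sketch is illustrative only: it reports the GC fraction of each primer and checks for the hexamers GAATTC and CTGCAG, which are the recognition sequences of EcoRI and PstI, respectively. The text does not state why these hexamers are present, so any use of them as cloning sites is an assumption. The sketch also checks that the reverse complement of SEQ ID NO. 26 contains the template-matching portion of SEQ ID NO. 27, consistent with the overlapping positions given for the two primers. All function and variable names are hypothetical.

```python
# Primer sequences copied from the text (5' to 3').
PRIMERS = {
    "SEQ ID NO. 25": "GAGGAATTCCCCATCCCTAATCCAGATTGGTG",
    "SEQ ID NO. 26": "AAACTGCAGGCCGAGCCACCTCTCTTCTGTGTTTG",
    "SEQ ID NO. 27": "AGGAATTCACAGAAGAGAGGTGGCTCGGCCTGC",
    "SEQ ID NO. 28": "AGCCTGCAGGAAGTCATACCTGGGGAGGTGGCCC",
}

# Recognition sequences; their role in these primers is an assumption.
SITES = {"EcoRI": "GAATTC", "PstI": "CTGCAG"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def sites_in(seq):
    """Names of the recognition sequences that occur in the sequence."""
    return [name for name, site in SITES.items() if site in seq]


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.translate(_COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, sites {sites_in(seq)}")

# Portion of SEQ ID NO. 27 that follows its 5' tail (AG + GAATTC = 8 nt).
core_27 = PRIMERS["SEQ ID NO. 27"][8:]
print("27 overlaps 26:", core_27 in reverse_complement(PRIMERS["SEQ ID NO. 26"]))
```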

[0405] C. Summary of the formation of the megachromosome

[0406] FIG. 2 schematically sets forth events leading to the formation of a stable megachromosome beginning with the generation of a dicentric chromosome in a mouse LMTK⁻ cell line: (A) A single E-type amplification in the centromeric region of the mouse chromosome 7 following transfection of LMTK⁻ cells with λCM8 and λgtWESneo generates the neo-centromere linked to the integrated foreign DNA, and forms a dicentric chromosome. Multiple E-type amplification forms the λneo-chromosome, which was derived from chromosome 7 and stabilized in a mouse-hamster hybrid cell line; (B) Specific breakage between the centromeres of a dicentric chromosome 7 generates a chromosome fragment with the neo-centromere, and a chromosome 7 with traces of foreign DNA at the end; (C) Inverted duplication of the fragment bearing the neo-centromere results in the formation of a stable neo-minichromosome; (D) Integration of exogenous DNA into the foreign DNA region of the formerly dicentric chromosome 7 initiates H-type amplification, and the formation of a heterochromatic arm. By capturing a euchromatic terminal segment, this new chromosome arm is stabilized in the form of the “sausage” chromosome; (E) BrdU treatment and/or drug selection appears to induce further H-type amplification, which results in the formation of an unstable gigachromosome; (F) Repeated BrdU treatments and/or drug selection induce further H-type amplification including a centromere duplication, which leads to the formation of another heterochromatic chromosome arm. It is split off from the chromosome 7 by chromosome breakage and acquires a terminal segment to form the stable megachromosome.

[0407] D. Expression of β-galactosidase and hygromycin transferase genes in cell lines carrying the megachromosome or derivatives thereof

[0408] The level of heterologous gene (i.e., β-galactosidase and hygromycin transferase genes) expression in cell lines containing the megachromosome or a derivative thereof was quantitatively measured. The relationship between the copy-number of the heterologous genes and the level of protein expressed therefrom was also determined.

[0409] 1. Materials and Methods

[0410] a. Cell lines

[0411] Heterologous gene expression levels of H1D3 cells, carrying a 250-400 Mb megachromosome as described above, and mM2C1 cells, carrying a 50-60 Mb micro-megachromosome, were quantitatively evaluated. mM2C1 cells were generated by repeated BrdU treatment and single cell cloning of the H1xHE41 cell line (mouse-hamster-human hybrid cell line carrying the megachromosome and a single human chromosome with CD4 and neo^(r) genes; see FIG. 4). The cell lines were grown under standard conditions in F12 medium under selective (120 μg/ml hygromycin) or non-selective conditions.

[0412] b. Preparation of cell extract for β-galactosidase assays

[0413] Monolayers of mM2C1 or H1D3 cell cultures were washed three times with phosphate-buffered saline (PBS). Cells were scraped with rubber policemen and suspended and washed again in PBS. Washed cells were resuspended in 0.25 M Tris-HCl, pH 7.8, and disrupted by three cycles of freezing in liquid nitrogen and thawing at 37° C. The extract was clarified by centrifugation at 12,000 rpm for 5 min. at 4° C.

[0414] c. β-galactosidase assay

[0415] The β-galactosidase assay mixture contained 1 mM MgCl₂, 45 mM β-mercaptoethanol, 0.8 mg/ml o-nitrophenyl-β-D-galactopyranoside and 66 mM sodium phosphate, pH 7.5. After incubating the reaction mixture with the cell extract at 37° C. for increasing periods of time, the reaction was terminated by the addition of three volumes of 1 M Na₂CO₃, and the optical density was measured at 420 nm. Assay mixture incubated without cell extract was used as a control. The linear range of the reaction was determined to be between 0.1 and 0.8 OD₄₂₀. One unit of β-galactosidase activity is defined as the amount of enzyme that will hydrolyse 3 nmoles of o-nitrophenyl-β-D-galactopyranoside in 1 minute at 37° C.

[0416] d. Preparation of cell extract for hygromycin phosphotransferase assay

[0417] Cells were washed as described above and resuspended in buffer (20 mM Hepes, pH 7.3, 100 mM potassium acetate, 5 mM Mg acetate and 2 mM dithiothreitol). Cells were disrupted at 0° C. by six 10 sec bursts in an MSE ultrasonic disintegrator using a microtip probe. Cells were allowed to cool for 1 min after each ultrasonic burst. The extracts were clarified by centrifuging for 1 min at 2000 rpm in a microcentrifuge.

[0418] e. Hygromycin phosphotransferase assay

[0419] Enzyme activity was measured by means of the phosphocellulose paper binding assay as described by Haas and Dowding [(1975). Meth. Enzymol. 43:611-628]. The cell extract was supplemented with 0.1 M ammonium chloride and 1 mM adenosine-γ-³²P-triphosphate (specific activity: 300 Ci/mmol). The reaction was initiated by the addition of 0.1 mg/ml hygromycin and incubated for increasing periods of time at 37° C. The reaction was terminated by heating the samples for 5 min at 75° C. in a water bath, and after removing the precipitated proteins by centrifugation for 5 min in a microcentrifuge, an aliquot of the supernatant was spotted on a piece of Whatman P-81 phosphocellulose paper (2 cm²). After 30 sec at room temperature the papers were placed into 500 ml of hot (75° C.) distilled water for 3 min. While the radioactive ATP remains in solution under these conditions, hygromycin phosphate binds strongly and quantitatively to phosphocellulose. The papers were rinsed 3 times in 500 ml of distilled water and the bound radioactivity was measured in toluene scintillation cocktail in a Beckman liquid scintillation counter. Reaction mixture incubated without added hygromycin served as a control.

[0420] f. Determination of the copy-number of the heterologous genes

[0421] DNA was prepared from the H1D3 and mM2C1 cells using standard purification protocols involving SDS lysis of the cells followed by Proteinase K treatment and phenol/chloroform extractions. The isolated DNA was digested with an appropriate restriction endonuclease, fractionated on agarose gels, blotted to nylon filters and hybridized with a radioactive probe derived either from the β-galactosidase or the hygromycin phosphotransferase genes. The level of hybridization was quantified in a Molecular Dynamics PhosphorImage Analyzer. To control for the total amount of DNA loaded from the different cell lines, the filters were reprobed with a single copy gene, and the hybridization of β-galactosidase and hygromycin phosphotransferase genes was normalized to the single copy gene hybridization.

[0422] g. Determination of protein concentration

[0423] The total protein content of the cell extracts was measured by the Bradford colorimetric assay using bovine serum albumin as standard.

[0424] 2. Characterization of the β-galactosidase and Hygromycin Phosphotransferase Activity Expressed in H1D3 and mM2C1 Cells

[0425] In order to establish quantitative conditions, the most important kinetic parameters of β-galactosidase and hygromycin phosphotransferase activity were studied. The β-galactosidase activity measured with a colorimetric assay was linear within the 0.1-0.8 OD₄₂₀ range for both the mM2C1 and H1D3 cell lines. The β-galactosidase activity was also proportional in both cell lines to the amount of protein added to the reaction mixture within the 5-100 μg total protein range. The hygromycin phosphotransferase activity of the mM2C1 and H1D3 cell lines was also proportional to the reaction time or the total amount of added cell extract under the conditions described for the β-galactosidase.

[0426] a. Comparison of β-galactosidase activity of mM2C1 and H1D3 cell lines

[0427] Cell extracts prepared from logarithmically growing mM2C1 and H1D3 cell lines were tested for β-galactosidase activity, and the specific activities were compared in 10 independent experiments. The β-galactosidase activity of H1D3 cell extracts was 440±25 U/mg total protein. Under identical conditions the β-galactosidase activity of the mM2C1 cell extracts was 4.8 times lower: 92±13 U/mg total protein.

[0428] β-galactosidase activities of highly subconfluent, subconfluent and nearly confluent cultures of H1D3 and mM2C1 cell lines were also compared. In these experiments different numbers of logarithmic H1D3 and mM2C1 cells were seeded in a constant volume of culture medium and grown for 3 days under standard conditions. No significant difference was found in the β-galactosidase specific activities of cell cultures grown at different cell densities, and the ratio of H1D3/mM2C1 β-galactosidase specific activities was also similar for all three cell densities. In confluent, stationary cell cultures of H1D3 or mM2C1 cells, however, the expression of β-galactosidase significantly decreased, likely due to cessation of cell division as a result of contact inhibition.

[0429] b. Comparison of hygromycin phosphotransferase activity of H1D3 and mM2C1 cell lines

[0430] The bacterial hygromycin phosphotransferase is present in a membrane-bound form in H1D3 or mM2C1 cell lines. This follows from the observation that the hygromycin phosphotransferase activity can be completely removed by high speed centrifugation of these cell extracts, and the enzyme activity can be recovered by resuspending the high speed pellet.

[0431] The ratio of the enzyme's specific activity in H1D3 and mM2C1 cell lines was similar to that of β-galactosidase activity, i.e., H1D3 cells have 4.1 times higher specific activity compared with mM2C1 cells.

[0432] c. Hygromycin phosphotransferase activity in H1D3 and mM2C1 cells grown under non-selective conditions

[0433] The level of expression of the hygromycin phosphotransferase gene was measured on the basis of quantitation of the specific enzyme activities in H1D3 and mM2C1 cell lines grown under non-selective conditions for 30 generations. The absence of hygromycin in the medium did not influence the expression of the hygromycin phosphotransferase gene.

[0434] 3. Quantitation of the Number of β-galactosidase and Hygromycin Phosphotransferase Gene Copies in H1D3 and mM2C1 Cell Lines

[0435] As described above, the β-galactosidase and hygromycin phosphotransferase genes are located only within the megachromosome or micro-megachromosome in H1D3 and mM2C1 cells. Quantitative analysis of genomic Southern blots of DNA isolated from H1D3 and mM2C1 cell lines with the PhosphorImage Analyzer revealed that the copy number of β-galactosidase genes integrated into the megachromosome is approximately 10 times higher in H1D3 cells than in mM2C1 cells. The copy-number of hygromycin phosphotransferase genes is approximately 7 times higher in H1D3 cells than in mM2C1 cells.
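The fold differences reported above for enzyme activity (4.8-fold for β-galactosidase and 4.1-fold for hygromycin phosphotransferase) and for gene copy number (approximately 10-fold and 7-fold, respectively) can be placed side by side. The following Python sketch is merely one illustrative way to do so and is not part of the reported analysis: it divides the activity ratio by the copy-number ratio, on the simplifying assumption that both ratios are exact point values. The function and variable names are hypothetical.

```python
# H1D3 / mM2C1 ratios quoted in the text (treated here as point values).
RATIOS = {
    "beta-galactosidase": {"activity": 4.8, "copies": 10.0},
    "hygromycin phosphotransferase": {"activity": 4.1, "copies": 7.0},
}


def activity_per_copy_ratio(activity_ratio, copy_ratio):
    """Relative activity per gene copy in H1D3 cells versus mM2C1 cells."""
    return activity_ratio / copy_ratio


for gene, r in RATIOS.items():
    per_copy = activity_per_copy_ratio(r["activity"], r["copies"])
    print(f"{gene}: activity x{r['activity']}, copies x{r['copies']}, "
          f"per-copy ratio {per_copy:.2f}")
```

Both comparisons point in the same direction: the cell line with more gene copies shows higher enzyme activity, although the relationship computed this way is not strictly one-to-one.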

[0436] 4. Summary and Conclusions of Results of Quantitation of Heterologous Gene Expression in Cells Containing Megachromosomes or Derivatives thereof

[0437] Quantitative determination of β-galactosidase activity of higher eukaryotic cells (e.g., H1D3 cells) carrying the bacterial β-galactosidase gene in heterochromatic megachromosomes confirmed the observed high-level expression of the integrated bacterial gene detected by cytological staining methods. It has generally been established in reports of studies of the expression of foreign genes in transgenic animals that, although transgene expression shows correct tissue and developmental specificity, the level of expression is typically low and shows extensive position-dependent variability (i.e., the level of transgene expression depends on the site of chromosomal integration). It has been assumed that the low-level transgene expression may be due to the absence of special DNA sequences which can insulate the transgene from the inhibitory effect of the surrounding chromatin and promote the formation of active chromatin structure required for efficient gene expression. Several cis-acting DNA sequence elements have been identified that abolish this position-dependent variability, and can ensure high-level expression of the transgene [locus activating region (LAR) sequences in higher eukaryotes and specific chromatin structure (scs) elements in lower eukaryotes (see, e.g., Eissenberg and Elgin (1991) Trends in Genet. 7:335-340)]. If these cis-acting DNA sequences are absent, the level of transgene expression is low and copy-number independent.

[0438] Although the bacterial β-galactosidase reporter gene contained in the heterochromatic megachromosomes of H1D3 and mM2C1 cells is driven by a potent eukaryotic promoter-enhancer element, no specific cis-acting DNA sequence element was designed and incorporated into the bacterial DNA construct which could function as a boundary element. Thus, the high-level β-galactosidase expression measured in these cells is of significance, particularly because the β-galactosidase gene in the megachromosome is located in a long, compact heterochromatic environment, which is known to be able to block gene expression. The megachromosome appears to contain DNA sequence element(s) in association with the bacterial DNA sequences that function to override the inhibitory effect of heterochromatin on gene expression.

[0439] The specificity of the heterologous gene expression in the megachromosome is further supported by the observation that the level of β-galactosidase expression is copy-number dependent. In the H1D3 cell line, which carries a full-size megachromosome, the specific activity of β-galactosidase is about 5-fold higher than in mM2C1 cells, which carry only a smaller, truncated version of the megachromosome. A comparison of the number of β-galactosidase gene copies in H1D3 and mM2C1 cell lines by quantitative hybridization techniques confirmed that the expression of β-galactosidase is copy-number dependent. The number of integrated β-galactosidase gene copies is approximately 10-fold higher in the H1D3 cells than in mM2C1 cells. Thus, the cell line containing the greater number of copies of the β-galactosidase gene also yields higher levels of β-galactosidase activity, which supports the copy-number dependency of expression. The copy-number dependency of the β-galactosidase and hygromycin phosphotransferase enzyme levels in cell lines carrying different derivatives of the megachromosome indicates that neither the chromatin organization surrounding the site of integration of the bacterial genes, nor the heterochromatic environment of the megachromosome suppresses the expression of the genes.

[0440] The relative amount of β-galactosidase protein expressed in H1D3 cells can be estimated based on the V_(max) of this enzyme [500 for homogeneous, crystallized bacterial β-galactosidase (Naider et al. (1972) Biochemistry 11:3202-3210)] and the specific activity of H1D3 cell protein. A V_(max) of 500 means that the homogeneous β-galactosidase protein hydrolyzes 500 μmoles of substrate per minute per mg of enzyme protein at 37° C. One mg of total H1D3 cell protein extract can hydrolyze 1.4 μmoles of substrate per minute at 37° C., which means that 0.28% of the protein present in the H1D3 cell extract is β-galactosidase.
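The estimate above is a simple ratio of the activity of the extract to the activity of the pure enzyme. The following Python sketch reproduces that arithmetic as an illustration; it also converts the measured 440 U/mg into μmoles per minute using the unit definition given earlier (3 nmoles per minute per unit), which yields a value close to, though not identical with, the 1.4 μmoles quoted. The function and variable names are hypothetical.

```python
# Values quoted in the text; names are illustrative only.
V_MAX_PURE = 500.0      # umol substrate/min per mg of pure beta-galactosidase
EXTRACT_ACTIVITY = 1.4  # umol substrate/min per mg of total H1D3 protein
UNITS_PER_MG = 440.0    # measured specific activity of H1D3 extract (U/mg)
NMOL_PER_UNIT = 3.0     # unit definition: 3 nmol substrate hydrolysed per min


def enzyme_fraction(extract_activity, v_max):
    """Mass fraction of total protein that is enzyme, from the activity ratio."""
    return extract_activity / v_max


def units_to_umol_per_min(units, nmol_per_unit):
    """Convert activity in units to umol of substrate hydrolysed per minute."""
    return units * nmol_per_unit / 1000.0


fraction = enzyme_fraction(EXTRACT_ACTIVITY, V_MAX_PURE)
converted = units_to_umol_per_min(UNITS_PER_MG, NMOL_PER_UNIT)
print(f"beta-galactosidase share of extract protein: {fraction * 100:.2f}%")
print(f"440 U/mg expressed as umol/min/mg: {converted:.2f}")
```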

[0441] The hygromycin phosphotransferase is present in a membrane-bound form in H1D3 and mM2C1 cells. The tendency of the enzyme to integrate into membranes in higher eukaryotic cells may be related to its periplasmic localization in prokaryotic cells. The bacterial hygromycin phosphotransferase has not been purified to homogeneity; thus, its V_(max) has not been determined. Therefore, no estimate can be made on the total amount of hygromycin phosphotransferase protein expressed in these cell lines. The 4-fold higher specific activity of hygromycin phosphotransferase in H1D3 cells as compared to mM2C1 cells, however, indicates that its expression is also copy number dependent.

[0442] The constant and high level expression of the β-galactosidase gene in H1D3 and mM2C1 cells, particularly in the absence of any selective pressure for the expression of this gene, clearly indicates the stability of the expression of genes carried in the heterochromatic megachromosomes. This conclusion is further supported by the observation that the level of hygromycin phosphotransferase expression did not change when H1D3 and mM2C1 cells were grown under non-selective conditions. The consistent high-level, stable, and copy-number dependent expression of bacterial marker genes clearly indicates that the megachromosome is an ideal vector system for expression of foreign genes.

EXAMPLE 7

[0443] Summary of Some of the Cell Lines with SATACs and Minichromosomes that have been Constructed

[0444] 1. EC3/7-Derived Cell Lines

[0445] The LMTK⁻-derived cell line, which is a mouse fibroblast cell line, was transfected with λCM8 and λgtWESneo DNA [see, EXAMPLE 2] to produce transformed cell lines. Among these was EC3/7, deposited at the European Collection of Animal Cell Culture (ECACC) under Accession No. 90051001 [see, U.S. Pat. No. 5,288,625; see, also Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110 and U.S. application Ser. No. 08/375,271]. This cell line contains the dicentric chromosome with the neo-centromere. Recloning and selection produced cell lines such as EC3/7C5, which are cell lines with the stable neo-minichromosome and the formerly dicentric chromosome [see, FIG. 2C].

[0446] 2. KE1-2/4 Cells

[0447] Fusion of EC3/7 with CHO-K20 cells and selection with G418/HAT produced hybrid cell lines; among these was KE1-2/4, which has been deposited with the ECACC under Accession No. 96040924. KE1-2/4 is a stable cell line that contains the λneo-chromosome [see, FIG. 2D; see, also U.S. Pat. No. 5,288,625], produced by E-type amplifications. KE1-2/4 has been transfected with vectors containing λ DNA, selectable markers, such as the puromycin-resistance gene, and genes of interest, such as p53 and the anti-HIV ribozyme gene. These vectors target the gene of interest into the λneo-chromosome by virtue of homologous recombination with the heterologous DNA in the chromosome.

[0448] 3. C5pMCT53 Cells

[0449] The EC3/7C5 cell line has been co-transfected with pH132, pCH110 and λ DNA [see, EXAMPLE 2] as well as other constructs. Various clones and subclones have been selected. For example, transformation with a construct that includes p53-encoding DNA produced cells designated C5pMCT53.

[0450] 4. TF1004G24 Cells

[0451] As discussed above, cotransfection of EC3/7C5 cells with plasmids [pH132, pCH110 available from Pharmacia, see, also Hall et al. (1983) J. Mol. Appl. Gen. 2:101-109] and with λ DNA [λcl 875 Sam 7 (New England Biolabs)] produced transformed cells. Among these is TF1004G24, which contains the DNA encoding the anti-HIV ribozyme in the neo-minichromosome. Recloning of TF1004G24 produced numerous cell lines. Among these is the NHHL24 cell line. This cell line also has the anti-HIV ribozyme in the neo-minichromosome and expresses high levels of β-gal. It has been fused with CHO-K20 cells to produce various hybrids.

[0452] 5. TF1004G19-Derived Cells

[0453] Recloning and selection of the TF1004G transformants produced the cell line TF1004G19, discussed above in EXAMPLE 4, which contains the unstable sausage chromosome and the neo-minichromosome. Single cell cloning produced the TF1004G-19C5 [see FIG. 4] cell line, which has a stable sausage chromosome and the neo-minichromosome. TF1004G-19C5 has been fused with CHO cells and the hybrids grown under selective conditions to produce the 19C5xHa4 and 19C5xHa3 cell lines [see, EXAMPLE 4] and others. Recloning of the 19C5xHa3 cell line yielded a cell line containing a gigachromosome, i.e., cell line 19C5xHa47, see FIG. 2E. BrdU treatment of 19C5xHa4 cells and growth under selective conditions [neomycin (G) and/or hygromycin (H)] has produced hybrid cell lines such as the G3D5 and G4D6 cell lines and others. G3D5 has the neo-minichromosome and the megachromosome. G4D6 has only the neo-minichromosome.

[0454] Recloning of 19C5xHa4 cells in H medium produced numerous clones. Among these is H1D3 [see FIG. 4], which has the stable megachromosome. Repeated BrdU treatment and recloning of H1D3 cells has produced the HB31 cell line, which has been used for transformations with the pTEMPUD, pTEMPU, pTEMPU3, and pCEPUR-132 vectors [see, Examples 12 and 14, below].

[0455] H1D3 has been fused with a CD4⁺ HeLa cell line that carries DNA encoding CD4 and neomycin resistance on a plasmid [see, e.g., U.S. Pat. Nos. 5,413,914, 5,409,810, 5,266,600, 5,223,263, 5,215,914 and 5,144,019, which describe these HeLa cells]. Selection with GH has produced hybrids, including H1xHE41 [see FIG. 4], which carries the megachromosome and also a single human chromosome that includes the CD4neo construct. Repeated BrdU treatment and single cell cloning has produced cell lines with the megachromosome [cell line 1B3, see FIG. 4]. About 25% of the 1B3 cells have a truncated megachromosome [˜90-120 Mb]. Another of these subclones, designated 2C5, was cultured on hygromycin-containing medium and megachromosome-free cell lines were obtained and grown in G418-containing medium. Recloning of these cells yielded cell lines such as IB4 and others that have a dwarf megachromosome [˜150-200 Mb], and cell lines, such as I1C3 and mM2C1, which have a micro-megachromosome [˜50-90 Mb]. The micro-megachromosome of cell line mM2C1 has no telomeres; however, if desired, synthetic telomeres, such as those described and generated herein, may be added to the mM2C1 cell micro-megachromosomes. Cell lines containing smaller truncated megachromosomes, such as the mM2C1 cell line containing the micro-megachromosome, can be used to generate even smaller megachromosomes, e.g., ˜10-30 Mb in size. This may be accomplished, for example, by breakage and fragmentation of the micro-megachromosome in these cells through exposing the cells to X-ray irradiation, BrdU or telomere-directed in vivo chromosome fragmentation.

EXAMPLE 8

[0456] Replication of the Megachromosome

[0457] The homogeneous architecture of the megachromosomes provides a unique opportunity to perform a detailed analysis of the replication of the constitutive heterochromatin.

[0458] A. Materials and methods

[0459] 1. Culture of cell lines

[0460] H1D3 mouse-hamster hybrid cells carrying the megachromosome [see, EXAMPLE 4] were cultured in F-12 medium containing 10% fetal calf serum [FCS] and 400 μg/ml Hygromycin B [Calbiochem]. G3D5 hybrid cells [see, Example 4] were maintained in F-12 medium containing 10% FCS, 400 μg/ml Hygromycin B (Calbiochem), and 400 μg/ml G418 [SIGMA]. Mouse A9 fibroblast cells were cultured in F-12 medium supplemented with 10% FCS.

[0461] 2. BrdU labelling

[0462] In typical experiments, 20-24 parallel semi-confluent cell cultures were set up in 10 cm Petri dishes. Bromodeoxyuridine (BrdU) (Fluka) was dissolved in distilled water alkalized with a drop of NaOH, to make a 10⁻² M stock solution. Aliquots of 10-50 μl of this BrdU stock solution were added to each 10 ml culture, to give a final BrdU concentration of 10-50 μM. The cells were cultured in the presence of BrdU for 30 min, and then washed with warm complete medium, and incubated without BrdU until required. At this point, 5 μg/ml colchicine was added to a sample culture every 1 or 2 h. After 1-2 h colchicine treatment, mitotic cells were collected by “shake-off” and regular chromosome preparations were made for immunolabelling.
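The final BrdU concentrations stated above follow from a simple dilution of the stock solution. The following Python sketch illustrates the arithmetic; it assumes, as the text implicitly does, that the volume of the added aliquot is negligible relative to the 10 ml culture. The function name and parameters are hypothetical.

```python
def final_concentration_um(stock_molar, aliquot_ul, culture_ml):
    """Final concentration in micromolar after adding an aliquot of stock.

    Assumption: the aliquot volume is small enough that the culture
    volume is treated as unchanged.
    """
    stock_um = stock_molar * 1e6          # mol/l -> umol/l
    aliquot_ml = aliquot_ul / 1000.0      # ul -> ml
    return stock_um * aliquot_ml / culture_ml


STOCK_MOLAR = 1e-2   # 10^-2 M BrdU stock, as stated in the text
CULTURE_ML = 10.0    # culture volume, as stated in the text

for aliquot_ul in (10, 50):
    conc = final_concentration_um(STOCK_MOLAR, aliquot_ul, CULTURE_ML)
    print(f"{aliquot_ul} ul of stock -> about {conc:.0f} uM BrdU")
```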

[0463] 3. Immunolabelling of chromosomes and in situ hybridization

[0464] Immunolabelling with fluorescein-conjugated anti-BrdU monoclonal antibody (Boehringer) was done according to the manufacturer's recommendations, except that for mouse A9 chromosomes, 2 M hydrochloric acid was used at 37° C. for 25 min, while for chromosomes of hybrid cells, 1 M hydrochloric acid was used at 37° C. for 30 min. In situ hybridization with biotin-labelled probes, and indirect immunofluorescence and in situ hybridization on the same preparation, were performed as described previously [Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110, see, also U.S. Pat. No. 5,288,625].

[0465] 4. Microscopy

[0466] All observations and microphotography were made by using a Vanox AHBS (Olympus) microscope. Fujicolor 400 Super G or Fujicolor 1600 Super HG high-speed colour negatives were used for photographs.

[0467] B. Results

[0468] The replication of the megachromosome was analyzed by BrdU pulse labelling followed by immunolabelling. The basic parameters for DNA labelling in vivo were first established. Using a 30-min pulse of 50 μM BrdU in parallel cultures, samples were taken and fixed at 5 min intervals from the beginning of the pulse, and every 15 min up to 1 h after the removal of BrdU. Incorporated BrdU was detected by immunolabelling with fluorescein-conjugated anti-BrdU monoclonal antibody. At the first time point (5 min) 38% of the nuclei were labelled, and a gradual increase in the number of labelled nuclei was observed during incubation in the presence of BrdU, culminating in 46% in the 30-min sample, at the time of the removal of BrdU. At further time points (60, 75, and 90 min) no significant changes were observed, and the fraction of labelled nuclei remained constant [44.5-46%].

[0469] These results indicate that (i) the incorporation of the BrdU is a rapid process, (ii) the 30 min pulse-time is sufficient for reliable labelling of S-phase nuclei, and (iii) the BrdU can be effectively removed from the cultures by washing.

[0470] The length of the cell cycle of the H1D3 and G3D5 cells was estimated by measuring the time between the appearance of the earliest BrdU signals on the extreme late replicating chromosome segments and the appearance of the same pattern only on one of the chromatids of the chromosomes after one completed cell cycle. The length of the G2 period was determined by the time of the first detectable BrdU signal on prophase chromosomes and by the labelled mitoses method [Quastler et al. (1959) Exp. Cell Res. 17:420-438]. The length of the S-phase was determined in three ways: (i) on the basis of the length of the cell cycle and the fraction of nuclei labelled during the 30-120 min pulse; (ii) by measuring the time between the very end of the replication of the extreme late replicating chromosomes and the detection of the first signal on the chromosomes at the beginning of S phase; (iii) by the labelled mitoses method. In repeated experiments, the duration of the cell cycle was found to be 22-26 h, the S phase 10-14 h, and the G2 phase 3.5-4.5 h.
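Method (i) above can be illustrated with a first-order calculation. The following Python sketch is not the procedure actually used; it assumes that cells are evenly distributed around the cell cycle, so that the fraction of nuclei labelled by a short pulse approximates the fraction of the cycle spent in S phase, and it ignores the duration of the pulse itself. The labelled fraction (46%) and the cell cycle range (22-26 h) are taken from the text; the function name is hypothetical.

```python
def s_phase_hours(cycle_hours, labelled_fraction):
    """First-order S-phase estimate: cycle length times labelled fraction.

    Assumptions: uniform distribution of cells around the cycle and a
    pulse that is short relative to the S phase.
    """
    return cycle_hours * labelled_fraction


LABELLED_FRACTION = 0.46     # nuclei labelled at the end of the 30-min pulse
CYCLE_RANGE_H = (22.0, 26.0) # measured cell cycle duration

for cycle in CYCLE_RANGE_H:
    estimate = s_phase_hours(cycle, LABELLED_FRACTION)
    print(f"cell cycle {cycle:.0f} h -> S phase about {estimate:.1f} h")
```

Under these assumptions the estimates fall within the 10-14 h range reported for the S phase.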

[0471] Analyses of the replication of the megachromosome were made in parallel cultures by collecting mitotic cells at two hour intervals following two hours of colchicine treatment. In a repeat experiment, the same analysis was performed using one hour sample intervals and one hour colchicine treatment. Although the two procedures gave comparable results, the two hour sample intervals were viewed as more appropriate since approximately 30% of the cells were found to have a considerably shorter or longer cell cycle than the average. The characteristic replication patterns of the individual chromosomes, especially some of the late replicating hamster chromosomes, served as useful internal markers for the different stages of S-phase. To minimize the error caused by the different lengths of cell cycles in the different experiments, samples were taken and analyzed throughout the whole cell cycle until the appearance of the first signals on one chromatid at the beginning of the second S-phase.

[0472] The sequence of replication in the megachromosome is as follows. At the very beginning of the S-phase, the replication of the megachromosome starts at the ends of the chromosomes. The first initiation of replication in an interstitial position can usually be detected at the centromeric region. Soon after, but still in the first quarter of the S-phase, when the terminal region of the short arm has almost completed its replication, discrete initiation signals appear along the chromosome arms. In the second quarter of the S-phase, as replication proceeds, the BrdU-labelled zones gradually widen, and the checkered pattern of the megachromosome becomes clear [see, e.g., FIG. 2F]. At the same time, pericentric regions of mouse chromosomes also show intense incorporation of BrdU. The replication of the megachromosome peaks at the end of the second quarter and in the third quarter of the S-phase. At the end of the third quarter, and at the very beginning of the last quarter of the S-phase, the megachromosome and the pericentric heterochromatin of the mouse chromosomes complete their replication. By the end of S-phase, only the very late replicating segments of mouse and hamster chromosomes are still incorporating BrdU.

[0473] The replication of the whole genome occurs in distinct phases. The signal of incorporated BrdU increased continuously until the end of the first half of the S-phase, but at the beginning of the third quarter of the S-phase chromosome segments other than the heterochromatic regions hardly incorporated BrdU. In the last quarter of the S-phase, the BrdU signals increased again when the extreme late replicating segments showed very intense incorporation.

[0474] Similar analyses of the replication in mouse A9 cells were performed as controls. To increase the resolution of the immunolabelling pattern, pericentric regions of A9 chromosomes were decondensed by treatment with Hoechst 33258. Because of the intense replication of the surrounding euchromatic sequences, precise localization of the initial BrdU signal in the heterochromatin was normally difficult, even on undercondensed mouse chromosomes. On those chromosomes where the initiation signal(s) were localized unambiguously, the replication of the pericentric heterochromatin of A9 chromosomes was similar to that of the megachromosome. Chromosomes of A9 cells also exhibited replication patterns and sequences similar to those of the mouse chromosomes in the hybrid cells. These results indicate that the replicators of the megachromosome and mouse chromosomes retained their original timing and specificity in the hybrid cells.

[0475] By comparing the pattern of the initiation sites obtained after BrdU incorporation with the location of the integration sites of the “foreign” DNA in a detailed analysis of the first quarter of the S-phase, an attempt was made to identify origins of replication (initiation sites) in relation to the amplicon structure of the megachromosome. The double band of integrated DNA on the long arm of the megachromosome served as a cytological marker. The results showed a colocalization of the BrdU and in situ hybridization signals found at the cytological level, indicating that the “foreign” DNA sequences are in close proximity to the origins of replication, presumably integrated into the non-satellite sequences between the replicator and the satellite sequences [see, FIG. 3]. As described in Example 6.B.4, the rDNA sequences detected in the megachromosome are also localized at the amplicon borders at the site of integration of the “foreign” DNA sequences, suggesting that the origins of replication responsible for initiation of replication of the megachromosome involve rDNA sequences. In the pericentric region of several other chromosomes, dot-like BrdU signals can also be observed that are comparable to the initiation signals on the megachromosome. These signals may represent similar initiation sites in the heterochromatic regions of normal chromosomes.

[0476] At a frequency of 10⁻⁴, “uncontrolled” amplification of the integrated DNA sequences was observed in the megachromosome. Consistent with the assumption (above) that “foreign” sequences are in proximity of the replicators, this spatially restricted amplification is likely to be a consequence of uncontrolled repeated firings of the replication origin(s) without completing the replication of the whole segment.

[0477] C. Discussion

[0478] It has generally been thought that the constitutive heterochromatin of the pericentric regions of chromosomes is late replicating [see, e.g., Miller (1976) Chromosoma 55:165-170]. On the contrary, these experiments evidence that the replication of the heterochromatic blocks starts at a discrete initiation site in the first half of the S-phase and continues through approximately three-quarters of S-phase. This difference can be explained in the following ways: (i) in normal chromosomes, actively replicating euchromatic sequences that surround the satellite DNA obscure the initiation signals and thus prevent precise localization of the initiation sites; (ii) replication of the heterochromatin can only be detected unambiguously in a period during the second half of the S-phase, when the bulk of the heterochromatin replicates and most other chromosomal regions have already completed their replication, or have not yet started it. Thus, low resolution cytological techniques, such as analysis of incorporation of radioactively labelled precursors by autoradiography, only detect prominent replication signals in the heterochromatin in the second half of S-phase, when adjacent euchromatic segments are no longer replicating.

[0479] In the megachromosome, the primary initiation sites ofreplication colocalize with the sites where the “foreign” DNA sequencesand rDNA sequences are integrated at the amplicon borders. Similarinitiation signals were observed at the same time in the pericentricheterochromatin of some of the mouse chromosomes that do not have“foreign” DNA, indicating that the replication initiation sites at theborders of amplicons may reside in the non-satellite flanking sequencesof the satellite DNA blocks. The presence of a primary initiation siteat each satellite DNA doublet implies that this large chromosome segmentis a single huge unit of replication [megareplicon] delimited by theprimary initiation site and the termination point at each end of theunit. Several lines of evidence indicate that, within this higher-orderreplication unit, “secondary” origins and replicons contribute to thecomplete replication of the megareplicon:

[0480] 1. The total replication time of the heterochromatic regions of the megachromosome was ˜9-11 h. At the rate of movement of replication forks, 0.5-5 kb per minute, that is typical of eukaryotic chromosomes [Kornberg et al. (1992) DNA Replication. 2nd. ed., New York: W. H. Freeman and Co, p. 474], replication of a ˜15 Mb replicon would require 50-500 h. Alternatively, if only a single replication origin was used, the average replication speed would have to be 25 kb per minute to complete replication within 10 h. A comparison of the intensity of the BrdU signals on the euchromatic and the heterochromatic chromosome segments provided no evidence for a 5- to 50-fold difference in their replication speed.
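The arithmetic in item 1 can be checked with a short calculation. The following is only a minimal sketch: it assumes, as the figures in the text imply, that replication proceeds across the full ˜15 Mb (15,000 kb) replicon at a constant fork rate, and the function and variable names are illustrative rather than part of the described method.

```python
# Illustrative check of the replication-time figures quoted in the text.
# Assumption: the whole replicon is traversed at one constant fork rate.

REPLICON_KB = 15_000          # ~15 Mb replicon, expressed in kb
OBSERVED_HOURS = 10           # midpoint of the observed ~9-11 h


def replication_hours(replicon_kb, fork_kb_per_min):
    """Hours needed to replicate replicon_kb at fork_kb_per_min."""
    return replicon_kb / fork_kb_per_min / 60.0


def required_rate_kb_per_min(replicon_kb, hours):
    """Fork rate (kb/min) needed to finish replicon_kb within hours."""
    return replicon_kb / (hours * 60.0)


slowest = replication_hours(REPLICON_KB, 0.5)   # typical lower fork rate
fastest = replication_hours(REPLICON_KB, 5.0)   # typical upper fork rate
needed = required_rate_kb_per_min(REPLICON_KB, OBSERVED_HOURS)

# Fold difference between the required rate and the typical range
fold_vs_fast = needed / 5.0
fold_vs_slow = needed / 0.5

print(f"time at typical rates: {fastest:.0f}-{slowest:.0f} h")
print(f"rate needed for {OBSERVED_HOURS} h: {needed:.0f} kb/min")
print(f"fold difference: {fold_vs_fast:.0f}- to {fold_vs_slow:.0f}-fold")
```

Under these assumptions the calculation reproduces the 50-500 h estimate, the 25 kb per minute requirement and the 5- to 50-fold difference stated above.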

[0481] 2. Using short BrdU pulse labelling, a single origin ofreplication would produce a replication band that moves along thereplicon, reflecting the movement of the replication fork. In contrast,a widening of the replication zone that finally gave rise to thecheckered pattern of the megachromosome was observed, and within thereplication period, the most intensive BrdU incorporation occurred inthe second half of the S-phase. This suggests that once themegareplicator has been activated, it permits the activation and firingof “secondary” origins, and that the replication of the bulk of thesatellite DNA takes place from these “secondary” origins during thesecond half of the S-phase. This is supported by the observation that incertain stages of the replication of the megachromosome, the wholeamplicon can apparently be labelled by a short BrdU pulse.

[0482] Megareplicators and secondary replication origins seem to beunder strict temporal and spatial control. The first initiation withinthe megachromosomes usually occurred at the centromere, and shortlyafterward all the megareplicators become active. The last segment of themegachromosome to complete replication was usually the second segment ofthe long arm. Results of control experiments with mouse A9 chromosomesindicate that replication of the heterochromatin of mouse chromosomescorresponds to the replication of the megachromosome amplicons.Therefore, the pre-existing temporal control of replication in theheterochromatic blocks is preserved in the megachromosome. Positive[Hassan et al. (1994) J. Cell. Sci. 107:425-434] and negative [Haase etal. (1994) Mol. Cell. Biol. 14:2516-2524] correlations betweentranscriptional activity and initiation of replication have beenproposed. In the megachromosome, transcription of the integrated genesseems to have no effect on the original timing of the replicationorigins. The concerted, precise timing of the megareplicator initiationsin the different amplicons suggests the presence of specific, cis-actingsequences, origins of replication.

[0483] Considering that pericentric heterochromatin of mouse chromosomescontains thousands of short, simple repeats spanning 7-15 Mb, and thecentromere itself may also contain hundreds of kilobases, the existenceof a higher-order unit of replication seems probable. The observeduncontrolled intrachromosomal amplification restricted to a replicationinitiation region of the megachromosome is highly suggestive of arolling-circle type amplification, and provides additional evidence forthe presence of a replication origin in this region.

[0484] The finding that a specific replication initiation site occurs atthe boundaries of amplicons suggests that replication might play a rolein the amplification process. These results suggest that each ampliconof the megachromosome can be regarded as a huge megareplicon defined bya primary initiation site [megareplicator] containing “secondary”origins of replication. Fusion of replication bubbles from differentorigins of bi-directional replication [DePamphilis (1993) Ann. Rev.Biochem. 62:29-63] within the megareplicon could form a giantreplication bubble, which would correspond to the whole megareplicon. Inthe light of this, the formation of megabase-size amplicons can beaccommodated by a replication-directed amplification mechanism. In H andE-type amplifications, intrachromosomal multiplication of the ampliconswas observed [see, above EXAMPLES], which is consistent with the unequalsister chromatid exchange model. Induced or spontaneous unscheduledreplication of a megareplicon in the constitutive heterochromatin mayalso form new amplicon(s) leading to the expansion of the amplificationor to the heterochromatic polymorphism of “normal” chromosomes. The“restoration” of the missing segment on the long arm of themegachromosome may well be the result of the re-replication of oneamplicon limited to one strand.

[0485] Taken together, without being bound by any theory, areplication-directed mechanism is a plausible explanation for theinitiation of large-scale amplifications in the centromeric regions ofmouse chromosomes, as well as for the de novo chromosome formations. Ifspecific [amplificator, i.e., sequences controlling amplification]sequences play a role in promoting the amplification process, sequencesat the primary replication initiation site [megareplicator] of themegareplicon are possible candidates.

[0486] The presence of rRNA gene sequence at the amplicon borders nearthe foreign DNA in the megachromosome suggests that this sequencecontributes to the primary replication initiation site and participatesin large-scale amplification of the pericentric heterochromatin in denovo formation of SATACs. Ribosomal RNA genes have an intrinsicamplification mechanism that provides for multiple copies of tandemgenes. Thus, for purposes herein, in the construction of SATACs incells, rDNA will serve as a region for targeted integration, and ascomponents of SATACs constructed in vitro.

EXAMPLE 9

[0487] Generation of chromosomes with amplified regions derived from mouse chromosome 1

[0488] To show that the events described in EXAMPLES 2-7 are not unique to mouse chromosome 7 and to show that the EC3/7 cell line is not required for formation of the artificial chromosomes, the experiments have been repeated using different initial cell lines and DNA fragments. Any cell or cell line should be amenable to use, or it can readily be determined that it is not.

[0489] A. Materials

[0490] The LP11 cell line was produced by the “scrape-loading” transfection method [Fechheimer et al. (1987) Proc. Natl. Acad. Sci. U.S.A. 84:8463-8467] using 25 μg plasmid DNA for 5×10⁶ recipient cells. LP11 cells were maintained in F-12 medium containing 3-15 μg/ml puromycin [SIGMA].

[0491] B. Amplification in LP11 cells

[0492] The large-scale amplification described in the above Examples isnot restricted to the transformed EC3/7 cell line or to the chromosome 7of mouse. In an independent transformation experiment, LMTK⁻ cells weretransfected using the calcium phosphate precipitation procedure with aselectable puromycin-resistance gene-containing construct designatedpPuroTel [see Example 1.E.2. for a description of this plasmid], toestablish cell line LP11. Cell line LP11 carries chromosome(s) withamplified chromosome segments of different lengths [˜150-600 Mb].Cytological analysis of the LP11 cells indicated that the amplificationoccurred in the pericentric region of the long arm of a submetacentricchromosome formed by Robertsonian translocation. This chromosome arm wasidentified by G-banding as chromosome 1. C-banding and in situhybridization with mouse major satellite DNA probe showed that an E-typeamplification had occurred: the newly formed region was composed of anarray of euchromatic chromosome segments containing different amounts ofheterochromatin. The size and C-band pattern of the amplified segmentswere heterogeneous. In several cells, the number of these amplifiedunits exceeded 50; single-cell subclones of LP11 cell lines, however,carry stable marker chromosomes with 10-15 segments and constant C-bandpatterns.

[0493] Sublines of the thymidine kinase-deficient LP11 cells (e.g., LP11-15P 1C5/7 cell line) established by single-cell cloning of LP11 cells were transfected with a thymidine kinase gene construct. Stable TK⁺ transfectants were established.

EXAMPLE 10

[0494] Isolation of SATACs and other Chromosomes with Atypical Base Content and/or Size

[0495] I. Isolation of artificial chromosomes from endogenous chromosomes

[0496] Artificial chromosomes, such as SATACs, may be sorted from endogenous chromosomes using any suitable procedures, which typically involve isolating metaphase chromosomes, distinguishing the artificial chromosomes from the endogenous chromosomes, and separating the artificial chromosomes from endogenous chromosomes. Such procedures will generally include the following basic steps: (1) culture of a sufficient number of cells (typically about 2×10⁷ mitotic cells) to yield, preferably, on the order of 1×10⁶ artificial chromosomes, (2) arrest of the cell cycle of the cells in a stage of mitosis, preferably metaphase, using a mitotic arrest agent such as colchicine, (3) treatment of the cells, particularly by swelling of the cells in hypotonic buffer, to increase susceptibility of the cells to disruption, (4) application of physical force to disrupt the cells in the presence of isolation buffers for stabilization of the released chromosomes, (5) dispersal of chromosomes in the presence of isolation buffers for stabilization of free chromosomes, (6) separation of artificial from endogenous chromosomes and (7) storage (and shipping if desired) of the isolated artificial chromosomes in appropriate buffers. Modifications and variations of the general procedure for isolation of artificial chromosomes, for example to accommodate different cell types with differing growth characteristics and requirements and to optimize the duration of mitotic block with arresting agents to obtain the desired balance of chromosome yield and level of debris, may be empirically determined.

[0497] Steps 1-5 relate to isolation of metaphase chromosomes. The separation of artificial from endogenous chromosomes (step 6) may be accomplished in a variety of ways. For example, the chromosomes may be stained with DNA-specific dyes such as Hoechst 33258 and chromomycin A₃ and sorted into artificial and endogenous chromosomes on the basis of dye content by employing fluorescence-activated cell sorting (FACS). To facilitate larger scale isolation of the artificial chromosomes, different separation techniques may be employed such as swinging bucket centrifugation (to effect separation based on chromosome size and density) [see, e.g., Mendelsohn et al. (1968) J. Mol. Biol. 32:101-108], zonal rotor centrifugation (to effect separation on the basis of chromosome size and density) [see, e.g., Burki et al. (1973) Prep. Biochem. 3:157-182; Stubblefield et al. (1978) Biochem. Biophys. Res. Commun. 83:1404-1414], and velocity sedimentation (to effect separation on the basis of chromosome size and shape) [see, e.g., Collard et al. (1984) Cytometry 5:9-19]. Immuno-affinity purification may also be employed in larger scale artificial chromosome isolation procedures. In this process, large populations of artificial chromosome-containing cells (asynchronous or mitotically enriched) are harvested en masse and the mitotic chromosomes (which can be released from the cells using standard procedures such as by incubation of the cells in hypotonic buffer and/or detergent treatment of the cells in conjunction with physical disruption of the treated cells) are enriched by binding to antibodies that are bound to solid state matrices (e.g., column resins or magnetic beads). Antibodies suitable for use in this procedure bind to condensed centromeric proteins or condensed and DNA-bound histone proteins. For example, autoantibody LU851 (see Hadlaczky et al. (1989) Chromosoma 97:282-288), which recognizes mammalian centromeres, may be used for large-scale isolation of chromosomes prior to subsequent separation of artificial from endogenous chromosomes using methods such as FACS. The bound chromosomes would be washed and eventually eluted for sorting. Immunoaffinity purification may also be used directly to separate artificial chromosomes from endogenous chromosomes. For example, SATACs may be generated in or transferred to (e.g., by microinjection or microcell fusion as described herein) a cell line that has chromosomes that contain relatively small amounts of heterochromatin, such as hamster cells (e.g., V79 cells or CHO-K1 cells). The SATACs, which are predominantly heterochromatin, are then separated from the endogenous chromosomes by utilizing anti-heterochromatin binding protein (Drosophila HP-1) antibody conjugated to a solid matrix. Such matrix preferentially binds SATACs relative to hamster chromosomes. Unbound hamster chromosomes are washed away from the matrix and the SATACs are eluted by standard techniques.

[0498] A. Cell lines and cell culturing procedures

[0499] In one isolation procedure, 1B3 mouse-hamster-human hybrid cells [see, FIG. 4] carrying the megachromosome or the truncated megachromosome were grown in F-12 medium supplemented with 10% fetal calf serum, 150 μg/ml hygromycin B and 400 μg/ml G418. GHB42 [a cell line recloned from G3D5 cells] mouse-hamster hybrid cells carrying the megachromosome and the minichromosome were also cultured in F-12 medium containing 10% fetal calf serum, 150 μg/ml hygromycin B and 400 μg/ml G418. The doubling time of both cell lines was about 24-40 hours, typically about 32 hours.

[0500] Typically, cell monolayers are passaged when they reach about 60-80% confluence and are split every 48-72 hours. Cells that reach greater than 80% confluence senesce in culture and are not preferred for chromosome harvesting. Cells may be plated in 100-200 100-mm dishes at about 50-70% confluency 12-30 hours before mitotic arrest (see, below).

[0501] Other cell lines that may be used as hosts for artificialchromosomes and from which the artificial chromosomes may be isolatedinclude, but are not limited to, PtK1 (NBL-3) marsupial kidney cells(ATCC accession no. CCL35), CHO-K1 Chinese hamster ovary cells (ATCCaccession no. CCL61), V79-4 Chinese hamster lung cells (ATCC accessionno. CCL93), Indian muntjac skin cells (ATCC accession no. CCL157),LMTK(−) thymidine kinase deficient murine L cells (ATCC accession no.CCL1.3), Sf9 fall armyworm (Spodoptera frugiperda) ovary cells (ATCCaccession no. CRL 1711) and any generated heterokaryon (hybrid) celllines, such as, for example, the hamster-murine hybrid cells describedherein, that may be used to construct MACs, particularly SATACs.

[0502] Cell lines may be selected, for example, to enhance efficiency of artificial chromosome production and isolation as may be desired in large-scale production processes. For instance, one consideration in selecting host cells may be the artificial chromosome-to-total chromosome ratio of the cells. To facilitate separation of artificial chromosomes from endogenous chromosomes, a higher artificial chromosome-to-total chromosome ratio might be desirable. For example, for H1D3 cells (a murine/hamster heterokaryon; see FIG. 4), this ratio is 1:50, i.e., one artificial chromosome (the megachromosome) to 50 total chromosomes. In contrast, Indian muntjac skin cells (ATCC accession no. CCL157) contain a smaller total number of chromosomes (a diploid number of chromosomes of 7), as do kangaroo rat cells (a diploid number of chromosomes of 12), which would provide for a higher artificial chromosome-to-total chromosome ratio upon introduction of, or generation of, artificial chromosomes in the cells.
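The ratio comparison can be illustrated with a few lines of arithmetic. This sketch assumes a single artificial chromosome copy per cell that is counted in the total together with the endogenous diploid set; the value of 49 endogenous chromosomes for H1D3 is back-calculated from the stated 1:50 ratio, and the function name is hypothetical.

```python
from fractions import Fraction


def mac_to_total_ratio(endogenous, mac_copies=1):
    """Artificial chromosomes as a fraction of all chromosomes in the cell.

    Assumption: the total is the endogenous count plus the artificial copies.
    """
    return Fraction(mac_copies, endogenous + mac_copies)


# Endogenous counts: H1D3 is inferred from the stated 1:50 ratio;
# muntjac (7) and kangaroo rat (12) are the diploid numbers given in the text.
hosts = {"H1D3": 49, "Indian muntjac": 7, "kangaroo rat": 12}
ratios = {name: mac_to_total_ratio(n) for name, n in hosts.items()}

for name, ratio in ratios.items():
    print(f"{name}: 1:{ratio.denominator}")
```

On these assumptions a muntjac or kangaroo rat host would give a ratio several-fold higher than the 1:50 of H1D3 cells, which is the direction of the advantage described above.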

[0503] Another consideration in selecting host cells for production and isolation of artificial chromosomes may be the size of the endogenous chromosomes as compared to that of the artificial chromosomes. Size differences of the chromosomes may be exploited to facilitate separation of artificial chromosomes from endogenous chromosomes. For example, because Indian muntjac skin cell chromosomes are considerably larger than minichromosomes and truncated megachromosomes, separation of the artificial chromosome from the muntjac chromosomes may possibly be accomplished using univariate (one dye, either Hoechst 33258 or Chromomycin A3) FACS separation procedures.

[0504] Another consideration in selecting host cells for production and isolation of artificial chromosomes may be the doubling time of the cells. For example, the amount of time required to generate a sufficient number of artificial chromosome-containing cells for use in procedures to isolate artificial chromosomes may be of significance for large-scale production. Thus, host cells with shorter doubling times may be desirable. For instance, the doubling time of V79 hamster lung cells is about 9-10 hours in comparison to the approximately 32-hour doubling time of H1D3 cells.
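The practical effect of doubling time on scale-up can be estimated as follows. This is a rough sketch that assumes uninterrupted exponential growth with no lag phase or passaging losses; the 1024-fold expansion is an arbitrary illustrative target, not a figure from the procedures described herein.

```python
import math


def hours_to_expand(fold_expansion, doubling_time_h):
    """Hours of ideal exponential growth needed for a given fold expansion."""
    doublings = math.log2(fold_expansion)
    return doublings * doubling_time_h


FOLD = 1024  # illustrative target: ten population doublings

v79_hours = hours_to_expand(FOLD, 10)    # V79 cells, ~9-10 h doubling time
h1d3_hours = hours_to_expand(FOLD, 32)   # H1D3 cells, ~32 h doubling time

print(f"V79:  {v79_hours:.0f} h (~{v79_hours / 24:.1f} days)")
print(f"H1D3: {h1d3_hours:.0f} h (~{h1d3_hours / 24:.1f} days)")
```

Under these idealized conditions the same expansion takes about 100 hours in V79 cells and about 320 hours in H1D3 cells.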

[0505] Accordingly, several considerations may go into the selection of host cells for the production and isolation of artificial chromosomes. It may be that the host cell selected as the most desirable for de novo formation of artificial chromosomes is not optimized for large-scale production of the artificial chromosomes generated in the cell line. In such cases, it may be possible, once the artificial chromosome has been generated in the initial host cell line, to transfer it to a production cell line better suited to efficient, high-level production and isolation of the artificial chromosome. Such transfer may be accomplished through several methods, for example through microcell fusion, as described herein, or microinjection into the production cell line of artificial chromosomes purified from the generating cell line using procedures such as described herein. Production cell lines preferably contain two or more copies of the artificial chromosome per cell.

[0506] B. Chromosome isolation

[0507] In general, cells are typically cultured for two generations at exponential growth prior to mitotic arrest. To accumulate mitotic 1B3 and GHB42 cells in one particular isolation procedure, 5 μg/ml colchicine was added for 12 hours to the cultures. The mitotic index obtained was 60-80%. The mitotic cells were harvested by selective detachment by gentle pipetting of the medium on the monolayer cells. It is also possible to utilize mechanical shake-off as a means of releasing the rounded-up (mitotic) cells from the plate. The cells were sedimented by centrifugation at 200×g for 10 minutes.

[0508] Cells (grown on plastic or in suspension) may be arrested in different stages of the cell cycle with chemical agents other than colchicine, such as hydroxyurea, vinblastine, colcemid or aphidicolin. Chemical agents that arrest the cells in stages other than mitosis, such as hydroxyurea and aphidicolin, are used to synchronize the cycles of all cells in the population and then are removed from the cell medium to allow the cells to proceed, more or less simultaneously, to mitosis at which time they may be harvested to disperse the chromosomes. Mitotic cells could be enriched by a mechanical shake-off (adherent cells). The cell cycles of cells within a population of MAC-containing cells may also be synchronized by nutrient, growth factor or hormone deprivation which leads to an accumulation of cells in the G₁ or G₀ stage; readdition of nutrients or growth factors then allows the quiescent cells to re-enter the cell cycle in synchrony for about one generation. Cell lines that are known to respond to hormone deprivation in this manner, and which are suitable as hosts for artificial chromosomes, include the Nb2 rat lymphoma cell line which is absolutely dependent on prolactin for stimulation of proliferation (see Gout et al. (1980) Cancer Res. 40:2433-2436). Culturing the cells in prolactin-deficient medium for 18-24 hours leads to arrest of proliferation, with cells accumulating early in the G₁ phase of the cell cycle. Upon addition of prolactin, all the cells progress through the cell cycle until M phase at which point greater than 90% of the cells would be in mitosis (addition of colchicine could increase the amount of the mitotic cells to greater than 95%). The time between reestablishing proliferation by prolactin addition and harvesting mitotic cells for chromosome separation may be empirically determined.

[0509] Alternatively, adherent cells, such as V79 cells, may be grown in roller bottles and mitotic cells released from the plastic surface by rotating the roller bottles at 200 rpm or greater (Shwarchuk et al. (1993) Int. J. Radiat. Biol. 64:601-612). At any given time, approximately 1% of the cells in an exponentially growing asynchronous population is in M-phase. Even without the addition of colchicine, 2×10⁷ mitotic cells have been harvested from four 1750-cm² roller bottles after a 5-min spin at 200 rpm. Addition of colchicine for 2 hours may increase the yield to 6×10⁸ mitotic cells.

[0510] Several procedures may be used to isolate metaphase chromosomes from these cells, including, but not limited to, one based on a polyamine buffer system [Cram et al. (1990) Methods in Cell Biology 33:377-382], one on a modified hexylene glycol buffer system [Hadlaczky et al. (1982) Chromosoma 86:643-65], one on a magnesium sulfate buffer system [Van den Engh et al. (1988) Cytometry 9:266-270 and Van den Engh et al. (1984) Cytometry 5:108], one on an acetic acid fixation buffer system [Stoehr et al. (1982) Histochemistry 74:57-61], and one on a technique utilizing hypotonic KCl and propidium iodide [Cram et al. (1994) XVII meeting of the International Society for Analytical Cytology, October 16-21, Tutorial IV Chromosome Analysis and Sorting with Commercial Flow Cytometers; Cram et al. (1990) Methods in Cell Biology 33:376].

[0511] 1. Polyamine procedure

[0512] In the polyamine procedure that was used in isolating artificial chromosomes from either 1B3 or GHB42 cells, about 10⁷ mitotic cells were incubated in 10 ml hypotonic buffer (75 mM KCl, 0.2 mM spermine, 0.5 mM spermidine) for 10 minutes at room temperature to swell the cells. The cells are swollen in hypotonic buffer to loosen the metaphase chromosomes but not to the point of cell lysis. The cells were then centrifuged at 100×g for 8 minutes, typically at room temperature. The cell pellet was drained carefully and about 10⁷ cells were resuspended in 1 ml polyamine buffer [15 mM Tris-HCl, 20 mM NaCl, 80 mM KCl, 2 mM EDTA, 0.5 mM EGTA, 14 mM β-mercaptoethanol, 0.1% digitonin, 0.2 mM spermine, 0.5 mM spermidine] for physical dispersal of the metaphase chromosomes. Chromosomes were then released by gently drawing the cell suspension up and expelling it through a 22 G needle attached to a 3 ml plastic syringe. The chromosome concentration was about 1-3×10⁸ chromosomes/ml.
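As an illustration of how the 10 ml hypotonic buffer might be assembled, the dilution relation C₁V₁ = C₂V₂ can be applied to each component. The stock concentrations below (1 M KCl, 100 mM spermine, 100 mM spermidine) are assumptions chosen for the example and are not specified herein; only the final concentrations come from the procedure above.

```python
def stock_volume_ml(final_mM, stock_mM, final_volume_ml):
    """Volume of stock needed, from C1*V1 = C2*V2."""
    return final_mM * final_volume_ml / stock_mM


FINAL_VOLUME_ML = 10.0

# component: (final concentration in mM, ASSUMED stock concentration in mM)
recipe = {
    "KCl": (75.0, 1000.0),
    "spermine": (0.2, 100.0),
    "spermidine": (0.5, 100.0),
}

volumes_ml = {
    name: stock_volume_ml(final, stock, FINAL_VOLUME_ML)
    for name, (final, stock) in recipe.items()
}
# Remainder of the volume is made up with water
water_ml = FINAL_VOLUME_ML - sum(volumes_ml.values())

for name, vol in volumes_ml.items():
    print(f"{name}: {vol * 1000:.0f} µl")
print(f"water: {water_ml:.2f} ml")
```

With these assumed stocks the buffer would take 750 µl KCl, 20 µl spermine and 50 µl spermidine, made up to 10 ml with water; different stock concentrations change only the volumes, not the relation used.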

[0513] The polyamine buffer isolation protocol is well suited for obtaining high molecular weight chromosomal DNA [Sillar and Young (1981) J. Histochem. Cytochem. 29:74-78; VanDilla et al. (1986) Biotechnology 4:537-552; Bartholdi et al. (1988) In “Molecular Genetics of Mammalian Cells” (M. Goettsman, ed.), Methods in Enzymology 151:252-267. Academic Press, Orlando]. The chromosome stabilizing buffer uses the polyamines spermine and spermidine to stabilize chromosome structure [Blumenthal et al. (1979) J. Cell Biol. 81:255-259; Lalande et al. (1985) Cancer Genet. Cytogenet. 23:151-157] and heavy metal chelators to reduce nuclease activity.

[0514] The polyamine buffer protocol has wide applicability; however, as with other protocols, the following variables must be optimized for each cell type: blocking time, cell concentration, type of hypotonic swelling buffer, swelling time, volume of hypotonic buffer, and vortexing time. Chromosomes prepared using this protocol are typically highly condensed.

[0515] There are several hypotonic buffers that may be used to swell the cells, for example buffers such as the following: 75 mM KCl; 75 mM KCl, 0.2 mM spermine, 0.5 mM spermidine; Ohnuki's buffer of 16.2 mM sodium nitrate, 6.5 mM sodium acetate, 32.4 mM KCl [Ohnuki (1965) Nature 208:916-917 and Ohnuki (1968) Chromosoma 25:402-428]; and a variation of Ohnuki's buffer that additionally contains 0.2 mM spermine and 0.5 mM spermidine. The amount and hypotonicity of added buffer vary depending on cell type and cell concentration. Amounts may range from 2.5-5.5 ml per 10⁷ cells or more. Swelling times may vary from 10-90 minutes depending on cell type and which swelling buffer is used.
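The per-cell buffer amounts scale linearly with the number of cells harvested. The sketch below simply scales the stated 2.5-5.5 ml per 10⁷ cells to a given cell count; the function name and the example count of 8×10⁷ cells are illustrative, and because the text allows "or more", the result is best read as a starting range for empirical optimization.

```python
def swelling_buffer_range_ml(n_cells, low_ml=2.5, high_ml=5.5, per_cells=1e7):
    """Scale the stated buffer range (ml per 1e7 cells) to n_cells."""
    scale = n_cells / per_cells
    return low_ml * scale, high_ml * scale


# Illustrative harvest size, not a figure from the procedure
low, high = swelling_buffer_range_ml(8e7)
print(f"{low:.1f}-{high:.1f} ml hypotonic buffer for 8e7 cells")
```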

[0516] The composition of the polyamine isolation buffer may also be varied. For example, one modified buffer contains 15 mM Tris-HCl, pH 7.2, 70 mM NaCl, 80 mM KCl, 2 mM EDTA, 0.5 mM EGTA, 14 mM beta-mercaptoethanol, 0.25% Triton-X, 0.2 mM spermine and 0.5 mM spermidine.

[0517] Chromosomal dispersal may also be accomplished by a variety of physical means. For example, cell suspension may be gently drawn up and expelled in a 3-ml syringe fitted with a 22-gauge needle [Cram et al. (1990) Methods in Cell Biology 33:377-382], cell suspension may be agitated on a bench-top vortex [Cram et al. (1990) Methods in Cell Biology 33:377-382], cell suspension may be disrupted with a homogenizer [Sillar and Young (1981) J. Histochem. Cytochem. 29:74-78; Carrano et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1382-1384] and cell suspension may be disrupted with a bench-top ultrasonic bath [Stoehr et al. (1982) Histochemistry 74:57-61].

[0518] 2. Hexylene glycol buffer system

[0519] In the hexylene glycol buffer procedure that was used in isolating artificial chromosomes from either 1B3 or GHB42 cells, about 8×10⁶ mitotic cells were resuspended in 10 ml glycine-hexylene glycol buffer [100 mM glycine, 1% hexylene glycol, pH 8.4-8.6 adjusted with saturated Ca-hydroxide solution] and incubated for 10 minutes at 37° C., followed by centrifugation for 10 minutes to pellet the nuclei. The supernatant was centrifuged again at 200×g for 20 minutes to pellet the chromosomes. Chromosomes were resuspended in isolation buffer (1-3×10⁸ chromosomes/ml).

[0520] The hexylene glycol buffer composition may also be modified. For example, one modified buffer contains 25 mM Tris-HCl, pH 7.2, 750 mM hexylene glycol, 0.5 mM CaCl₂, 1.0 mM MgCl₂ [Carrano et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1382-1384].

[0521] 3. Magnesium-sulfate buffer system

[0522] This buffer system may be used with any of the methods of cell swelling and chromosomal dispersal, such as described above in connection with the polyamine and hexylene glycol buffer systems. In this procedure, mitotic cells are resuspended in the following buffer: 4.8 mM HEPES, pH 8.0, 9.8 mM MgSO₄, 48 mM KCl, 2.9 mM dithiothreitol [Van den Engh et al. (1985) Cytometry 6:92 and Van den Engh et al. (1984) Cytometry 5:108].

[0523] 4. Acetic acid fixation buffer system

[0524] This buffer system may be used with any of the methods of cell swelling and chromosomal dispersal, such as described above in connection with the polyamine and hexylene glycol buffer systems. In this procedure, mitotic cells are resuspended in the following buffer: 25 mM Tris-HCl, pH 3.2, 750 mM (1,6)-hexandiol, 0.5 mM CaCl₂, 1.0% acetic acid [Stoehr et al. (1982) Histochemistry 74:57-61].

[0525] 5. KCl-propidium iodide buffer system

[0526] This buffer system may be used with any of the methods of cell swelling and chromosomal dispersal, such as described above in connection with the polyamine and hexylene glycol buffer systems. In this procedure, mitotic cells are resuspended in the following buffer: 25 mM KCl, 50 μg/ml propidium iodide, 0.33% Triton X-100, 333 μg/ml RNase [Cram et al. (1990) Methods in Cell Biology 33:376].

[0527] The fluorescent dye propidium iodide is used and also serves as a chromosome stabilizing agent. Swelling of the cells in the hypotonic medium (which may also contain propidium iodide) may be monitored by placing a small drop of the suspension on a microscope slide and observing the cells by phase/fluorescent microscopy. The cells should exclude the propidium iodide while swelling, but some may lyse prematurely and show chromosome fluorescence. After the cells have been centrifuged and resuspended in the KCl-propidium iodide buffer system, they will be lysed due to the presence of the detergent in the buffer. The chromosomes may then be dispersed and then incubated at 37° C. for up to 30 minutes to permit the RNase to act. The chromosome preparation is then analyzed by flow cytometry. The propidium iodide fluorescence can be excited at the 488 nm wavelength of an argon laser and detected through an OG 570 optical filter by a single photomultiplier tube. The single pulse may be integrated and acquired in a univariate histogram. The flow cytometer may be aligned to a CV of 2% or less using small (1.5 μm diameter) microspheres. The chromosome preparation is filtered through 60 μm nylon mesh before analysis.

[0528] C. Staining of chromosomes with DNA-specific dyes

[0529] Subsequent to isolation, the chromosome preparation was stained with Hoechst 33258 at 6 μg/ml and chromomycin A3 at 200 μg/ml. Fifteen minutes prior to analysis, 25 mM Na-sulphite and 10 mM Na-citrate were added to the chromosome suspension.

[0530] D. Flow sorting of chromosomes

[0531] Chromosomes obtained from 1B3 and GHB42 cells and maintained were suspended in a polyamine-based sheath buffer (0.5 mM EGTA, 2.0 mM EDTA, 80 mM KCl, 70 mM NaCl, 15 mM Tris-HCl, pH 7.2, 0.2 mM spermine and 0.5 mM spermidine) [Sillar and Young (1981) J. Histochem. Cytochem. 29:74-78]. The chromosomes were then passed through a dual-laser cell sorter [FACStar Plus or FACS Vantage, Becton Dickinson Immunocytometry System; other dual-laser sorters may also be used, such as those manufactured by Coulter Electronics (Elite ESP) and Cytomation (MoFlo)] in which two lasers were set to excite the dyes separately, allowing a bivariate analysis of the chromosomes by size and base-pair composition. Because of the difference between the base composition of the SATACs and the other chromosomes and the resulting difference in interaction with the dyes, as well as size differences, the SATACs were separated from the other chromosomes.

[0532] E. Storage of the sorted artificial chromosomes

[0533] Sorted chromosomes may be pelleted by centrifugation and resuspended in a variety of buffers, and stored at 4° C. For example, the isolated artificial chromosomes may be stored in GH buffer (100 mM glycine, 1% hexylene glycol pH 8.4-8.6 adjusted with saturated Ca-hydroxide solution) [see, e.g., Hadlaczky et al. (1982) Chromosoma 86:643-659] for one day and embedded by centrifugation into agarose. The sorted chromosomes are centrifuged into an agarose bed and the plugs are stored in 500 mM EDTA at 4° C. Additional storage buffers include CMB-I/polyamine buffer (17.5 mM Tris-HCl, pH 7.4, 1.1 mM EDTA, 50 mM epsilon-amino caproic acid, 5 mM benzamide-HCl, 0.40 mM spermine, 1.0 mM spermidine, 0.25 mM EGTA, 40 mM KCl, 35 mM NaCl) and CMB-II/polyamine buffer (100 mM glycine, pH 7.5, 78 mM hexylene glycol, 0.1 mM EDTA, 50 mM epsilon-amino caproic acid, 5 mM benzamide-HCl, 0.40 mM spermine, 1.0 mM spermidine, 0.25 mM EGTA, 40 mM KCl, 35 mM NaCl).

[0534] When microinjection is the intended use, the sorted chromosomes are stored in 30% glycerol at −20° C. Sorted chromosomes may also be stored without glycerol for short periods of time (3-6 days) in storage buffers at 4° C. Exemplary buffers for microinjection include CBM-I (10 mM Tris-HCl, pH 7.5, 0.1 mM EDTA, 50 mM epsilon-amino caproic acid, 5 mM benzamide-HCl, 0.30 mM spermine, 0.75 mM spermidine) and CBM-II (100 mM glycine, pH 7.5, 78 mM hexylene glycol, 0.1 mM EDTA, 50 mM epsilon-amino caproic acid, 5 mM benzamide-HCl, 0.30 mM spermine, 0.75 mM spermidine).

[0535] For long-term storage of sorted chromosomes, the above buffers are preferably supplemented with 50% glycerol and stored at −20° C.

[0536] F. Quality control

[0537] 1. Analysis of the purity

[0538] The purity of the sorted chromosomes was checked by fluorescence in situ hybridization (FISH) with a biotin-labeled mouse satellite DNA probe [see, Hadlaczky et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:8106-8110]. Purity of the isolated chromosomes was about 97-99%.

[0539] 2. Characteristics of the sorted chromosomes

[0540] Pulsed field gel electrophoresis and Southern hybridization were carried out to determine the size distribution of the DNA content of the sorted artificial chromosomes.

[0541] G. Functioning of the purified artificial chromosomes

[0542] To check whether their activity is preserved, the purified artificial chromosomes may be microinjected (using methods such as those described in Example 13) into primary cells, somatic cells and stem cells, which are then analyzed for expression of the heterologous genes carried by the artificial chromosomes, e.g., by analysis for growth on selective medium and assays of β-galactosidase activity.

[0543] II. Sorting of mammalian artificial chromosome-containing microcells

[0544] A. Micronucleation

[0545] Cells were grown to 80-90% confluency in 4 T 150 flasks. Colcemid was added to a final concentration of 0.06 μg/ml, and the cells were then incubated at 37° C. for 24 hours.

[0546] B. Enucleation

[0547] Ten μg/ml cytochalasin B was added and the resulting microcells were centrifuged at 15,000 rpm for 70 minutes at 28-33° C.

[0548] C. Purification of microcells by filtration

[0549] The microcells were purified using Swinnex filter units and Nucleopore filters [5 μm and 3 μm].

[0550] D. Staining and sorting microcells

[0551] As above, the cells were stained with Hoechst and chromomycin A3 dyes. The microcells were sorted by cell sorter to isolate the microcells that contain the mammalian artificial chromosomes.

[0552] E. Fusion

[0553] The microcells that contain the artificial chromosome are fused, for example, as described in Example 1.A.5., to selected primary cells, somatic cells or embryonic stem cells to generate transgenic (non-human) animals and for gene therapy purposes, and to other cells to deliver the chromosomes to the cells.

EXAMPLE 11

[0554] Introduction of Mammalian Artificial Chromosomes Into Insect Cells

[0555] Insect cells are useful hosts for MACs, particularly for use in the production of gene products, for a number of reasons, including:

[0556] 1. A mammalian artificial chromosome provides an extra-genomic specific integration site for introduction of genes encoding proteins of interest [reduced chance of mutation in production system].

[0557] 2. The large size of an artificial chromosome permits megabase size DNA integration so that genes encoding an entire pathway leading to a protein or nonprotein of therapeutic value, such as an alkaloid [digitalis, morphine, taxol], can be accommodated by the artificial chromosome.

[0558] 3. Amplification of genes encoding useful proteins can be accomplished in the artificial mammalian chromosome to obtain higher protein yields in insect cells.

[0559] 4. Insect cells support required post-translational modifications (glycosylation, phosphorylation) essential for protein biological function.

[0560] 5. Insect cells do not support mammalian viruses, which eliminates cross-contamination of product with human infectious agents.

[0561] 6. The ability to introduce chromosomes circumvents traditional recombinant baculovirus systems for production of nutritional, industrial or medicinal proteins in insect cell systems.

[0562] 7. The low temperature optimum for insect cell growth (28° C.) permits reduced energy cost of production.

[0563] 8. Serum free growth medium for insect cells will result in lower production costs.

[0564] 9. Artificial chromosome-containing cells can be stored indefinitely at low temperature.

[0565] 10. Insect larvae will serve as biological factories for the production of nutritional, medicinal or industrial proteins by microinjection of fertilized insect eggs.

[0566] A. Demonstration that insect cells recognize mammalian promoters

[0567] Gene constructs containing a mammalian promoter, such as the CMV promoter, linked to a detectable marker gene [Renilla luciferase gene (see, e.g., U.S. Pat. No. 5,292,658 for a description of DNA encoding the Renilla luciferase, and plasmid pTZrLuc-1, which can provide the starting material for construction of such vectors; see also SEQ ID No. 10)] and also including the simian virus 40 (SV40) promoter operably linked to the β-galactosidase gene were introduced into the cells of two species, Trichoplusia ni [cabbage looper] and Bombyx mori [silkworm].

[0568] After transferring the constructs into the insect cell lines either by electroporation or by microinjection, expression of the marker genes was detected in luciferase assays (see e.g., Example 12.C.3) and in β-galactosidase assays (such as lacZ staining assays) after a 24-h incubation. In each case a positive result was obtained in the samples containing the genes which was absent in samples in which the genes were omitted. In addition, a B. mori β-actin promoter-Renilla luciferase gene fusion was introduced into the T. ni and B. mori cells which yielded light emission after transfection. Thus, certain mammalian promoters function to direct expression of these marker genes in insect cells. Therefore, MACs are candidates for expression of heterologous genes in insect cells.

[0569] B. Construction of vectors for use in insect cells and fusion with mammalian cells

[0570] 1. Transform LMTK⁻ cells with expression vector with:

[0571] a. B. mori β-actin promoter-Hyg^(r) selectable marker gene for insect cells, and

[0572] b. SV40 or CMV promoters controlling a puromycin^(r) selectable marker gene for mammalian cells.

[0573] 2. Detect expression of the mammalian promoter in LMTK cells (puromycin^(r) LMTK cells)

[0574] 3. Use puromycin^(r) cells in fusion experiments with Bombyx and Trichoplusia cells, select Hyg^(r) cells.

[0575] C. Insertion of the MACs into insect cells

[0576] These experiments are designed to detect expression of a detectable marker gene [such as the β-galactosidase gene expressed under the control of a mammalian promoter, such as pSV40] located on a MAC that has been introduced into an insect cell. Data indicate that β-gal was expressed.

[0577] Insect cells are fused with mammalian cells containing mammalian artificial chromosomes, e.g., the minichromosome [EC3/7C5] or the mini and the megachromosome [such as GHB42, which is a cell line recloned from G3D5] or a cell line that carries only the megachromosome [such as H1D3 or a reclone therefrom]. Fusion is carried out as follows:

[0578] 1. mammalian+insect cells (50/50%) in log phase growth are mixed;

[0579] 2. calcium/PEG cell fusion: (10 min-0.5 h);

[0580] 3. heterokaryons (+72 h) are selected.

[0581] The following selection conditions can be used to select for insect cells that contain a MAC [+=positive selection; −=negative selection]:

[0582] 1. growth at 28° C. (+insect cells, −mammalian cells);

[0583] 2. Grace's insect cell medium [SIGMA] (−mammalian cells);

[0584] 3. no exogenous CO₂ (−mammalian cells); and/or

[0585] 4. antibiotic selection (Hyg or G418) (+transformed insect cells).

[0586] Immediately following the fusion protocol, many heterokaryons [fusion events] are observed between the mammalian and each species of insect cells [up to 90% heterokaryons]. After growth [2+ weeks] on insect medium containing G418 and/or hygromycin at selection levels used for selection of transformed mammalian cells, individual colonies are detected growing on the fusion plates. By virtue of selection for the antibiotic resistance conferred by the MAC and selection for insect cells, these colonies should contain MACs.

[0587] The B. mori β-actin gene promoter has been shown to direct expression of the β-galactosidase gene in B. mori cells and mammalian cells (e.g., EC3/7C5 cells). The B. mori β-actin gene promoter is, thus, particularly useful for inclusion in MACs generated in mammalian cells that will subsequently be transferred into insect cells because the presence of any marker gene linked to the promoter can be determined in the mammalian and resulting insect cell lines.

EXAMPLE 12

[0588] Preparation of chromosome fragmentation vectors and other vectors for targeted integration of DNA into MACs

[0589] Fragmentation of the megachromosome should ultimately result in smaller stable chromosomes that contain about 15 Mb to 50 Mb that will be easily manipulated for use as vectors. Vectors to effect such fragmentation should also aid in determination and identification of the elements required for preparation of an in vitro-produced artificial chromosome.

[0590] Reduction in the size of the megachromosome can be achieved in a number of different ways including: stress treatment, such as by starvation, or cold or heat treatment; treatment with agents that destabilize the genome or nick DNA, such as BrdU, coumarin, EMS and others; treatment with ionizing radiation [see, e.g., Brown (1992) Curr. Opin. Genes Dev. 2:479-486]; and telomere-directed in vivo chromosome fragmentation [see, e.g., Farr et al. (1995) EMBO J. 14:5444-5454].

[0591] A. Preparation of vectors for fragmentation of the artificial chromosome and also for targeted integration of selected gene products

[0592] 1. Construction of pTEMPUD

[0593] Plasmid pTEMPUD [see FIG. 5] is a mouse homologous recombination “killer” vector for in vivo chromosome fragmentation, and also for inducing large-scale amplification via site-specific integration. With reference to FIG. 5, the ˜3,625-bp SalI-PstI fragment was derived from the pBabe-puro retroviral vector [see, Morgenstern et al. (1990) Nucleic Acids Res. 18:3587-3596]. This fragment contains DNA encoding ampicillin resistance, the pUC origin of replication, and the puromycin N-acetyl transferase gene under control of the SV40 early promoter. The URA3 gene portion comes from the pYAC5 cloning vector [SIGMA]. URA3 was cut out of pYAC5 with SalI-XhoI digestion, cloned into pNEB193 [New England Biolabs], which was then cut with EcoRI-SalI and ligated to the SalI site of pBabepuro to produce pPU.

[0594] A 1293-bp fragment [see SEQ ID No. 1] encoding the mouse major satellite was isolated as an EcoRI fragment from a DNA library produced from mouse LMTK⁻ fibroblast cells and inserted into the EcoRI site of pPU to produce pMPU.

[0595] The TK promoter-driven diphtheria toxin gene [DT-A] was derived from pMC1DT-A [see, Maxwell et al. (1986) Cancer Res. 46:4660-4666] by BglII-XhoI digestion and cloned into the pMC1neo poly A expression vector [STRATAGENE, La Jolla, Calif.] by replacing the neomycin-resistance gene coding sequence. The TK promoter, DT-A gene and poly A sequence were removed from this vector, cohesive ends were filled with Klenow and the resulting fragment was blunt-end ligated into the SnaBI site [TACGTA] of pMPU to produce pMPUD.

[0596] The Hutel 2.5-kb fragment [see SEQ ID No. 3] was inserted at the PstI site [see the 6100 PstI-3625 PstI fragment on pTEMPUD] of pMPUD to produce pTEMPUD. This fragment includes a human telomere. It includes a unique BglII site [see nucleotides 1042-1047 of SEQ ID No. 3], which will be used as a site for introduction of a synthetic telomere that includes multiple repeats [80] of TTAGGG with BamHI and BglII ends for insertion into the BglII site, which will then remain unique, since the BamHI overhang is compatible with the BglII overhang. Ligation of a BamHI end to a BglII end destroys the BglII site, so that only a single BglII site will remain. Selection for the unique BglII site ensures that the synthetic telomere will be inserted in the correct orientation. The unique BglII site is the site at which the vector is linearized.
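
The end-compatibility reasoning in the preceding paragraph can be illustrated with a short string-level sketch. It assumes the standard recognition sequences G^GATCC for BamHI and A^GATCT for BglII, each leaving a 5′-GATC overhang; the function names are hypothetical and serve only this illustration, not the cloning procedure itself.

```python
# Recognition sequences written 5'->3'; both enzymes cut after the first base
# (assumed standard cut positions: G^GATCC and A^GATCT).
BAMHI = "GGATCC"
BGLII = "AGATCT"


def overhang(site: str) -> str:
    """Four-base 5' overhang exposed after cutting between bases 1 and 2."""
    return site[1:5]


def junction(left_site: str, right_site: str) -> str:
    """Six-base sequence formed when the upstream end of a cut at left_site
    is ligated to the downstream end of a cut at right_site."""
    return left_site[0] + right_site[1:]


def describe(seq: str) -> str:
    """Name the enzyme, if any, whose recognition sequence matches seq."""
    return {BAMHI: "BamHI", BGLII: "BglII"}.get(seq, "neither")


if __name__ == "__main__":
    print("compatible overhangs:", overhang(BAMHI) == overhang(BGLII))
    for left, right in [(BGLII, BGLII), (BGLII, BAMHI), (BAMHI, BGLII)]:
        j = junction(left, right)
        print(describe(left), "+", describe(right), "->", j,
              "| recut by:", describe(j))
```

In this sketch only the BglII/BglII junction regenerates a BglII site; either mixed junction matches neither enzyme, which is why a single BglII site is expected to remain after insertion.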

[0597] To generate a synthetic telomere made up of multiple repeats of the sequence TTAGGG, attempts were made to clone or amplify ligation products of 30-mer oligonucleotides containing repeats of the sequence. Two 30-mer oligonucleotides, one containing four repeats of TTAGGG bounded on each end of the complete run of repeats by half of a repeat and the other containing five repeats of the complement AATCCC, were annealed. The resulting double-stranded molecule with 3-bp protruding ends, each representing half of a repeat, was expected to ligate with itself to yield concatamers of n×30 bp. However, this approach was unsuccessful, likely due to formation of quadruplex DNA from the G-rich strand. Similar difficulty has been encountered in attempts to generate long repeats of the pentameric human satellite II and III units. Thus, it appears that, in general, any oligomer sequence containing periodically spaced consecutive series of guanine nucleotides is likely to undergo undesired quadruplex formation that hinders construction of long double-stranded DNAs containing the sequence.

[0598] Therefore, in another attempt to construct a synthetic telomere for insertion into the BglII site of pTEMPUD, the starting material was based on the complementary C-rich repeat sequence (i.e., AATCCC) which would not be susceptible to quadruplex structure formation. Two plasmids, designated pTel280110 and pTel280111, were constructed as follows to serve as the starting materials.

[0599] First, a long oligonucleotide containing 9 repeats of the sequence AATCCC (i.e., the complement of telomere sequence TTAGGG) in reverse order bounded on each end of the complete run of repeats by half of a repeat (therefore, in essence, containing 10 repeats), and recognition sites for PstI and PacI restriction enzymes was synthesized using standard methods. The oligonucleotide sequence is as follows: 5′-AAACTGCAGGTTAATTAACCCTAACCCTAACCC TAACCCTAACGCTAACCCTAACCCTAACCCTAACCC TAACCCGGGAT-3′ (SEQ ID NO. 29)

[0600] A partially complementary short oligonucleotide of sequence

[0601] 3′-TTGGGCCCTAGGCTTAAGG-5′ (SEQ ID NO. 30)

[0602] was also synthesized. The oligonucleotides were gel-purified, annealed, repaired with Klenow polymerase and digested with EcoRI and PstI. The resulting EcoRI/PstI fragment was ligated with EcoRI/PstI-digested pUC19. The resulting plasmid was used to transform E. coli DH5α competent cells and plasmid DNA (pTel102) from one of the transformants surviving selection on LB/ampicillin was digested with PacI, rendered blunt-ended by Klenow and dNTPs and digested with HindIII. The resulting 2.7-kb fragment was gel-purified.

[0603] Simultaneously, the same plasmid was amplified by the polymerase chain reaction using extended and more distal 26-mer M13 sequencing primers. The amplification product was digested with SmaI and HindIII, the double-stranded 84-bp fragment containing the 60-bp telomeric repeat (plus 24 bp of linker sequence) was isolated on a 6% native polyacrylamide gel, and ligated with the double-digested pTel102 to yield a 120-bp telomeric sequence. This plasmid was used to transform DH5α cells. Plasmid DNA from two of the resulting recombinants that survived selection on ampicillin (100 μg/ml) was sequenced on an ABI DNA sequencer using the dye-termination method. One of the plasmids, designated pTel29, contained a sequence of 20 repeats of the sequence TTAGGG (i.e., 19 successive repeats of TTAGGG bounded on each end of the complete run of repeats with half of a repeat). The other plasmid, designated pTel28, had undergone a deletion of 2 bp (TA) at the junction where the two sequences, each containing, in essence, 10 repeats of the TTAGGG sequence, had been ligated to yield the plasmid. This resulted in a GGGTGGG motif at the junction in pTel28. This mutation provides a useful tag in telomere-directed chromosome fragmentation experiments. Therefore, the pTel29 insert was amplified by PCR using pUC/M13 sequencing primers based on sequence somewhat longer and farther from the polylinker than usual as follows:

[0604] 5′-GCCAGGGTTTTCCCAGTCACGACGT-3′ (SEQ ID NO. 31)

[0605] or in some experiments

[0606] 5′-GCTGCAAGGCGATTAAGTTGGGTAAC-3′ (SEQ ID NO. 32)

[0607] as the M13 forward primer, and

[0608] 5′-TATGTTGTGTGGAATTGTGAGCGGAT-3′ (SEQ ID NO. 33)

[0609] as the M13 reverse primer.

[0610] The amplification product was digested with SmaI and HindIII. The resulting 144-bp fragment was gel-purified on a 6% native polyacrylamide gel and ligated with pTel28 that had been digested with PacI, blunt-ended with Klenow and dNTP and then digested with HindIII to remove linker. The ligation yielded a plasmid designated pTel2801 containing a telomeric sequence of 40 repeats of the sequence TTAGGG in which one of the repeats (i.e., the 30th repeat) lacked two nucleotides (TA), due to the deletion that had occurred in pTel28, to yield a repeat as follows: TGGG.

[0611] In the next extension step, pTel2801 was digested with SmaI and HindIII and the 264-bp insert fragment was gel-purified and ligated with pTel2801 which had been digested with PacI, blunt-ended and digested with HindIII. The resulting plasmid was transformed into DH5α cells and plasmid DNA from 12 of the resulting transformants that survived selection on ampicillin was examined by restriction enzyme analysis for the presence of a 0.5-kb EcoRI/PstI insert fragment. Eleven of the recombinants contained the expected 0.5-kb insert. The inserts of two of the recombinants were sequenced and found to be as expected. These plasmids were designated pTel280110 and pTel280111. These plasmids, which are identical, both contain 80 repeats of the sequence TTAGGG, in which two of the repeats (i.e., the 30th and 70th repeats) lacked two nucleotides (TA), due to the deletion that had occurred in pTel28, to yield a repeat as follows: TGGG. Thus, in each of the cloning steps (except the first), the length of the synthetic telomere doubled; that is, it was increasing in size exponentially. Its length was 60×2^(n) bp, wherein n is the number of extension cloning steps undertaken. Therefore, in principle (assuming E. coli, or any other microbial host, e.g., yeast, tolerates long tandem repetitive DNA), it is possible to assemble any desirable size of safe telomeric repeats.
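
The doubling relationship stated above (length = 60×2^(n) bp) can be tabulated with a minimal sketch. The function and constant names are hypothetical, and the repeat counts are nominal: the 2-bp (TA) deletions carried over from pTel28 are deliberately ignored.

```python
REPEAT = "TTAGGG"
BASE_BP = 60  # starting unit: in essence 10 repeats of the hexamer


def telomere_bp(n: int) -> int:
    """Nominal synthetic telomere length after n extension cloning steps."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return BASE_BP * 2 ** n


def nominal_repeats(n: int) -> int:
    """Nominal number of hexamer repeats after n steps (deletions ignored)."""
    return telomere_bp(n) // len(REPEAT)


def steps_needed(target_bp: int) -> int:
    """Smallest number of extension steps giving at least target_bp."""
    n = 0
    while telomere_bp(n) < target_bp:
        n += 1
    return n


if __name__ == "__main__":
    for n in range(4):
        print(f"n={n}: {telomere_bp(n)} bp, {nominal_repeats(n)} repeats")
```

With these assumptions, n = 1, 2 and 3 give 120, 240 and 480 bp (20, 40 and 80 nominal repeats), matching the repeat counts reported for pTel29, pTel2801 and pTel280110/pTel280111.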

[0612] In a further extension step, pTel280110 was digested with PacI, blunt-ended with Klenow polymerase in the presence of dNTP, then digested with HindIII. The resulting 0.5-kb fragment was gel-purified. Plasmid pTel280111 was cleaved with SmaI and HindIII and the 3.2-kb fragment was gel-purified and ligated to the 0.5-kb fragment from pTel280110. The resulting plasmid was used to transform DH5α cells. Plasmid DNA was purified from transformants surviving ampicillin selection. Nine of the selected recombinants were examined by restriction enzyme analysis for the presence of a 1.0-kb EcoRI/PstI fragment. Four of the recombinants (designated pTlk2, pTlk6, pTlk7 and pTlk8) were thus found to contain the desired 960-bp telomere DNA insert sequence that included 160 repeats of the sequence TTAGGG in which four of the repeats lacked two nucleotides (TA), due to the deletion that had occurred in pTel28, to yield a repeat as follows: TGGG. Partial DNA sequence analysis of the EcoRI/PstI fragment of two of these plasmids (i.e., pTlk2 and pTlk6), in which approximately 300 bp from both ends of the fragment were elucidated, confirmed that the sequence was composed of successive repeats of the TTAGGG sequence.

[0613] In order to add PmeI and BglII sites to the synthetic telomere sequence, pTlk2 was digested with PacI and PstI and the 3.7-kb fragment (i.e., 2.7-kb pUC19 and 1.0-kb repeat sequence) was gel-purified and ligated at the PstI cohesive end with the following oligonucleotide: 5′-GGGTTTAAACAGATCTCTGCA-3′ (SEQ ID NO. 34). The ligation product was subsequently repaired with Klenow polymerase and dNTP, ligated to itself and transformed into E. coli strain DH5α. A total of 14 recombinants surviving selection on ampicillin were obtained. Plasmid DNA from each recombinant could be cleaved with BglII, indicating that this added unique restriction site had been retained by each recombinant. Four of the 14 recombinants contained the complete 1-kb synthetic telomere insert, whereas the insert of the remaining 10 recombinants had undergone deletions of various lengths. The four plasmids in which the 1-kb synthetic telomere sequence remained intact were designated pTlkV2, pTlkV5, pTlkV8 and pTlkV12. Each of these plasmids could also be digested with PmeI; in addition, the presence of both the BglII and PmeI sites was verified by sequence analysis. Any of these four plasmids can be digested with BamHI and BglII to release a fragment containing the 1-kb synthetic telomere sequence which is then ligated with BglII-digested pTEMPUD.
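
As an illustration of how the linker oligonucleotide carries both added sites, the following sketch scans the printed sequence of SEQ ID NO. 34 for the PmeI and BglII recognition sequences. The recognition sequences (GTTTAAAC and AGATCT) are assumed standard values, and the helper name is hypothetical.

```python
# Assumed standard recognition sequences, written 5'->3'.
SITES = {"PmeI": "GTTTAAAC", "BglII": "AGATCT"}

# Linker oligonucleotide as printed in the text (SEQ ID NO. 34).
OLIGO = "GGGTTTAAACAGATCTCTGCA"


def find_sites(seq: str, sites: dict) -> dict:
    """Map each enzyme name to the 0-based start of its first site, or -1."""
    return {name: seq.find(site) for name, site in sites.items()}


if __name__ == "__main__":
    for name, pos in find_sites(OLIGO, SITES).items():
        status = f"found at index {pos}" if pos >= 0 else "not found"
        print(f"{name}: {status}")
    # The 3' end TGCA is self-complementary, consistent with annealing to
    # the 3' overhang left by PstI (assumed cut position CTGCA^G).
    print("ends in TGCA:", OLIGO.endswith("TGCA"))
```

Both sites are present and do not overlap, and the oligonucleotide ends in TGCA, which is consistent with ligation at a PstI cohesive end as described.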

[0614] 2. Use of pTEMPUD for in vivo chromosome fragmentation

[0615] Linearization of pTEMPUD by BglII results in a linear molecule with a human telomere at one end. Integration of this linear fragment into the chromosome, such as the megachromosome in hybrid cells or any mouse chromosome which contains repeats of the mouse major satellite sequence, results in integration of the selectable marker puromycin-resistance gene and cleavage of the plasmid by virtue of the telomeric end. The DT gene prevents the entire linear fragment from integrating by random events, since upon integration and expression it is toxic. Thus random integration will be toxic, so site-directed integration into the targeted DNA will be selected. Such integration will produce fragmented chromosomes.

[0616] The fragmented truncated chromosome with the new telomere will survive, and the other fragment without the centromere will be lost. Repeated in vivo fragmentations will ultimately result in selection of the smallest functioning artificial chromosome possible. Thus, this vector can be used to produce minichromosomes from mouse chromosomes, or to fragment the megachromosome. In principle, this vector can be used to target any selected DNA sequence in any chromosome to achieve fragmentation.

[0617] 3. Construction of pTERPUD

[0618] A fragmentation/targeting vector analogous to pTEMPUD for in vivo chromosome fragmentation, and also for inducing large-scale amplification via site-specific integration, but which is based on mouse rDNA sequence instead of mouse major satellite DNA, has been designated pTERPUD. In this vector, the mouse major satellite DNA sequence of pTEMPUD has been replaced with a 4770-bp BamHI fragment of megachromosome clone 161 which contains sequence corresponding to nucleotides 10,232-15,000 in SEQ ID NO. 16.

[0619] 4. pHASPUD and pTEMPhu3

[0620] Vectors that specifically target human chromosomes can be constructed from pTEMPUD. These vectors can be used to fragment specific human chromosomes, depending upon the selected satellite sequence, to produce human minichromosomes, and also to isolate human centromeres.

[0621] a. pHASPUD

[0622] To render pTEMPUD suitable for fragmenting human chromosomes, the mouse major satellite sequence is replaced with human satellite sequences. Unlike mouse chromosomes, each human chromosome has a unique satellite sequence. For example, the mouse major satellite has been replaced with a human hexameric α-satellite [or alphoid satellite] DNA sequence. This sequence is an 813-bp fragment [nucleotides 232-1044 of SEQ ID No. 2] from clone pS12, deposited in the EMBL database under Accession number X60716, isolated from a human colon carcinoma cell line Colo320 [deposited under Accession No. ATCC CCL 220.1]. The 813-bp alphoid fragment can be obtained from the pS12 clone by nucleic acid amplification using synthetic primers, each of which contains an EcoRI site, as follows:

GGGGAATTCAT TGGGATGTTT CAGTTGA [SEQ ID No. 4] forward primer

CGAAAGTCCCC CCTAGGAGAT CTTAAGGA [SEQ ID No. 5] reverse primer
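
The statement that each primer contains an EcoRI site can be checked against the printed sequences with a small sketch. The helper name is hypothetical, and the check is purely string-based: it reports whether GAATTC appears when a primer is read as printed or when the printed string is read in the opposite direction. No claim is made here about the orientation convention the authors intended.

```python
ECORI = "GAATTC"  # EcoRI recognition sequence (palindromic)

# Primer sequences as printed in the text, with spacing removed.
FORWARD = "GGGGAATTCATTGGGATGTTTCAGTTGA"   # SEQ ID No. 4
REVERSE = "CGAAAGTCCCCCCTAGGAGATCTTAAGGA"  # SEQ ID No. 5


def ecori_orientation(primer: str):
    """Return how the EcoRI site is found in the printed primer string."""
    if ECORI in primer:
        return "as printed"
    if ECORI in primer[::-1]:
        return "reversed"
    return None


if __name__ == "__main__":
    print("forward primer:", ecori_orientation(FORWARD))
    print("reverse primer:", ecori_orientation(REVERSE))
```

For the sequences as printed, the site is found directly in the forward primer and only in the reversed reading of the reverse primer, which may indicate that the reverse primer is printed 3′→5′; this is an observation about the printed strings, not a confirmed convention.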

[0623] Digestion of the amplified product with EcoRI results in a fragment with EcoRI ends that includes the human α-satellite sequence. This sequence is inserted into pTEMPUD in place of the EcoRI fragment that contains the mouse major satellite to yield pHASPUD.

[0624] Vector pHASPUD was linearized with BglII and used to transform EJ30 (human fibroblast) cells by scrape loading. Twenty-seven puromycin-resistant transformant strains were obtained.

[0625] b. pTEMPhu3

[0626] In pTEMPhu3, the mouse major satellite sequence is replaced by the 3-kb human chromosome 3-specific α-satellite from D3Z1 [deposited under ATCC Accession No. 85434; see, also Yrokov (1989) Cytogenet. Cell Genet. 51:1114].

[0627] 5. Use of pTEMPhu3 to induce amplification on human chromosome #3

[0628] Each human chromosome contains a unique chromosome-specific alphoid sequence. Thus, pTEMPhu3, which is targeted to the chromosome 3-specific α-satellite, can be introduced into human cells under selective conditions, whereby large-scale amplification of the chromosome 3 centromeric region and production of a de novo chromosome ensues. Such induced large-scale amplification provides a means for inducing de novo chromosome formation and also for in vivo cloning of defined human chromosome fragments up to megabase size.

[0629] For example, the break-point in human chromosome 3 is on the short arm near the centromere. This region is involved in renal cell carcinoma formation. By targeting pTEMPhu3 to this region, the induced large-scale amplification may contain this region, which can then be cloned using the bacterial and yeast markers in the pTEMPhu3 vector.

[0630] The pTEMPhu3 cloning vector allows not only selection for homologous recombinants, but also direct cloning of the integration site in YACs. This vector can also be used to target human chromosome 3, preferably with a deleted short arm, in a mouse-human monochromosomal microcell hybrid line. Homologous recombinants can be screened by nucleic acid amplification (PCR), and amplification can be screened by DNA hybridization, Southern hybridization, and in situ hybridization. The amplified region can be cloned into a YAC. This vector and these methods also permit a functional analysis of cloned chromosome regions by reintroducing the cloned amplified region into mammalian cells.

[0631] B. Preparation of libraries in YAC vectors for cloning of centromeres and identification of functional chromosomal units

[0632] Another method that may be used to obtain smaller-sized functional mammalian artificial chromosome units and to clone centromeric DNA involves screening of mammalian DNA YAC vector-based libraries and functional analysis of potential positive clones in a transgenic mouse model system. A mammalian DNA library is prepared in a YAC vector, such as YRT2 [see Schedl et al. (1993) Nuc. Acids Res. 21:4783-4787], which contains the murine tyrosinase gene. The library is screened for hybridization to mammalian telomere and centromere sequence probes. Positive clones are isolated and microinjected into pronuclei of fertilized oocytes of NMRI/Han mice following standard techniques. The embryos are then transferred into NMRI/Han foster mothers. Expression of the tyrosinase gene in transgenic offspring confers an identifiable phenotype (pigmentation). The clones that give rise to tyrosinase-expressing transgenic mice are thus confirmed as containing functional mammalian artificial chromosome units.

[0633] Alternatively, fragments of SATACs may be introduced into the YAC vectors and then introduced into pronuclei of fertilized oocytes of NMRI/Han mice following standard techniques as above. The clones that give rise to tyrosinase-expressing transgenic mice are thus confirmed as containing functional mammalian artificial chromosome units, particularly centromeres.

[0634] C. Incorporation of Heterologous Genes into Mammalian Artificial Chromosomes through the Use of Homology Targeting Vectors

[0635] As described above, the use of mammalian artificial chromosomes for expression of heterologous genes obviates certain negative effects that may result from random integration of heterologous plasmid DNA into the recipient cell genome. An essential feature of the mammalian artificial chromosome that makes it a useful tool in avoiding the negative effects of random integration is its presence as an extra-genomic gene source in recipient cells. Accordingly, methods of specific, targeted incorporation of heterologous genes exclusively into the mammalian artificial chromosome, without extraneous random integration into the genome of recipient cells, are desired for heterologous gene expression from a mammalian artificial chromosome.

[0636] One means of achieving site-specific integration of heterologous genes into artificial chromosomes is through the use of homology targeting vectors. The heterologous gene of interest is subcloned into a targeting vector which contains nucleic acid sequences that are homologous to nucleotides present in the artificial chromosome. The vector is then introduced into cells containing the artificial chromosome for specific site-directed integration into the artificial chromosome through a recombination event at sites of homology between the vector and the chromosome. The homology targeting vectors may also contain selectable markers for ease of identifying cells that have incorporated the vector into the artificial chromosome as well as lethal selection genes that are expressed only upon extraneous integration of the vector into the recipient cell genome. Two exemplary homology targeting vectors, λCF-7 and pλCF-7-DTA, are described below.

[0637] 1. Construction of Vector λCF-7

[0638] Vector λCF-7 contains the cystic fibrosis transmembrane conductance regulator [CFTR] gene as an exemplary therapeutic molecule-encoding nucleic acid that may be incorporated into mammalian artificial chromosomes for use in gene therapy applications. This vector, which also contains the puromycin-resistance gene as a selectable marker, as well as the Saccharomyces cerevisiae ura3 gene [orotidine-5-phosphate decarboxylase], was constructed in a series of steps as follows.

[0639] a. Construction of pURA

[0640] Plasmid pURA was prepared by ligating a 2.6-kb SalI/XhoI fragment from the yeast artificial chromosome vector pYAC5 [Sigma; see also Burke et al. (1987) Science 236:806-812 for a description of YAC vectors as well as GenBank Accession no. U01086 for the complete sequence of pYAC5] containing the S. cerevisiae ura3 gene with a 3.3-kb SalI/SmaI fragment of pHyg [see, e.g., U.S. Pat. Nos. 4,997,764, 4,686,186 and 5,162,215, and the description above]. Prior to ligation the XhoI end was treated with Klenow polymerase for blunt end ligation to the SmaI end of the 3.3-kb fragment of pHyg. Thus, pURA contains the S. cerevisiae ura3 gene, the E. coli ColE1 origin of replication and the ampicillin-resistance gene. The ura3 gene is included to provide a means to recover the integrated construct from a mammalian cell as a YAC clone.

[0641] b. Construction of pUP2

[0642] Plasmid pURA was digested with SalI and ligated to a 1.5-kb SalI fragment of pCEPUR. Plasmid pCEPUR is produced by ligating the 1.1-kb SnaBI-NheI fragment of pBabe-puro [Morgenstern et al. (1990) Nucl. Acids Res. 18:3587-3596; provided by Dr. L. Székely (Microbiology and Tumorbiology Center, Karolinska Institutet, Stockholm); see, also Tonghua et al. (1995) Chin. Med. J. (Beijing, Engl. Ed.) 108:653-659; Couto et al. (1994) Infect. Immun. 62:2375-2378; Dunckley et al. (1992) FEBS Lett. 296:128-34; French et al. (1995) Anal. Biochem. 228:354-355; Liu et al. (1995) Blood 85:1095-1103; International PCT application Nos. WO 95/20044; WO 95/00178, and WO 94/19456] to the NheI-NruI fragment of pCEP4 [Invitrogen].

[0643] The resulting plasmid, pUP2, contains all the elements of pURA plus the puromycin-resistance gene linked to the SV40 promoter and polyadenylation signal from pCEPUR.

[0644] c. Construction of pUP-CFTR

[0645] The intermediate plasmid pUP-CFTR was generated in order to combine the elements of pUP2 into a plasmid along with the CFTR gene. First, a 4.5-kb SalI fragment of pCMV-CFTR that contains only the CFTR-encoding DNA [see, also, Riordan et al. (1989) Science 245:1066-1073, U.S. Pat. No. 5,240,846, and Genbank Accession no. M28668 for the sequence of the CFTR gene] was ligated to XhoI-digested pCEP4 [Invitrogen and also described herein] in order to insert the CFTR gene in the multiple cloning site of the Epstein Barr virus-based (EBV) vector pCEP4 [Invitrogen, San Diego, Calif.; see also Yates et al. (1985) Nature 313:812-815; see, also U.S. Pat. No. 5,468,615] between the CMV promoter and SV40 polyadenylation signal. The resulting plasmid was designated pCEP-CFTR. Plasmid pCEP-CFTR was then digested with SalI and the 5.8-kb fragment containing the CFTR gene flanked by the CMV promoter and SV40 polyadenylation signal was ligated to SalI-digested pUP2 to generate pUP-CFTR. Thus, pUP-CFTR contains all elements of pUP2 plus the CFTR gene linked to the CMV promoter and SV40 polyadenylation signal.

[0646] d. Construction of λCF-7

[0647] Plasmid pUP-CFTR was then linearized by partial digestion with EcoRI and the 13-kb fragment containing the CFTR gene was ligated with EcoRI-digested Charon 4A λ [see Blattner et al. (1977) Science 196:161; Williams and Blattner (1979) J. Virol. 29:555 and Sambrook et al. (1989) Molecular Cloning, A Laboratory Manual, Second Ed., Cold Spring Harbor Laboratory Press, Volume 1, Section 2.18, for descriptions of Charon 4A λ]. The resulting vector, λCF8, contains the Charon 4A λ bacteriophage left arm, the CFTR gene linked to the CMV promoter and SV40 polyadenylation signal, the ura3 gene, the puromycin-resistance gene linked to the SV40 promoter and polyadenylation signal, the thymidine kinase promoter [TK], the ColE1 origin of replication, the ampicillin-resistance gene and the Charon 4A λ bacteriophage right arm. The λCF8 construct was then digested with XhoI and the resulting 27.1-kb fragment was ligated to the 0.4-kb XhoI/EcoRI fragment of pJBP86 [described below], containing the SV40 polyA signal, and the EcoRI-digested Charon 4A λ right arm. The resulting vector λCF-7 contains the Charon 4A λ left arm, the CFTR-encoding DNA linked to the CMV promoter and SV40 polyA signal, the ura3 gene, the puromycin-resistance gene linked to the SV40 promoter and polyA signal and the Charon 4A λ right arm. The λ DNA fragments provide sequences homologous to nucleotides present in the exemplary artificial chromosomes.

[0648] The vector is then introduced into cells containing the artificial chromosomes exemplified herein. Accordingly, when the linear λCF-7 vector is introduced into megachromosome-carrying fusion cell lines, such as described herein, it will be specifically integrated into the megachromosome through recombination between the homologous bacteriophage λ sequences of the vector and the artificial chromosome.

[0649] 2. Construction of Vector λCF-7-DTA

[0650] Vector λCF-7-DTA also contains all the elements contained in λCF-7, but additionally contains a lethal selection marker, the diphtheria toxin-A (DT-A) gene, as well as the ampicillin-resistance gene and an origin of replication. This vector was constructed in a series of steps as follows.

[0651] a. Construction of pJBP86

[0652] Plasmid pJBP86 was used in the construction of λCF-7, above. A 1.5-kb SalI fragment of pCEPUR containing the puromycin-resistance gene linked to the SV40 promoter and polyadenylation signal was ligated to HindIII-digested pJB8 [see, e.g., Ish-Horowitz et al. (1981) Nucleic Acids Res. 9:2989-2998; available from ATCC as Accession No. 37074; commercially available from Amersham, Arlington Heights, Ill.]. Prior to ligation the SalI ends of the 1.5-kb fragment of pCEPUR and the HindIII-linearized pJB8 ends were treated with Klenow polymerase. The resulting vector pJBP86 contains the puromycin-resistance gene linked to the SV40 promoter and polyA signal, the 1.8-kb COS region of Charon 4A λ, the ColE1 origin of replication and the ampicillin-resistance gene.

[0653] b. Construction of pMEP-DTA

[0654] A 1.1-kb XhoI/SalI fragment of pMC1-DT-A [see, e.g., Maxwell et al. (1986) Cancer Res. 46:4660-4666] containing the diphtheria toxin-A gene was ligated to XhoI-digested pMEP4 [Invitrogen, San Diego, Calif.] to generate pMEP-DTA. To produce pMC1-DT-A, the coding region of the DT-A gene was isolated as an 800-bp PstI/HindIII fragment from p2249-1 and inserted into pMC1neopolyA [pMC1 available from Stratagene] in place of the neo gene and under the control of the TK promoter. The resulting construct pMC1-DT-A was digested with HindIII, the ends were filled by Klenow and SalI linkers were ligated to produce a 1061-bp TK-DTA gene cassette with an XhoI end [5′] and a SalI end containing the 270-bp TK promoter and the ˜790-bp DT-A fragment. This fragment was ligated into XhoI-digested pMEP4.

[0655] Plasmid pMEP-DTA thus contains the DT-A gene linked to the TK promoter and SV40 polyadenylation signal, the ColE1 origin of replication and the ampicillin-resistance gene.

[0656] c. Construction of pJB83-DTA9

[0657] Plasmid pJB8 was digested with HindIII and ClaI and ligated with an oligonucleotide [see SEQ ID NOs. 7 and 8 for the sense and antisense strands of the oligonucleotide, respectively] to generate pJB83.

[0658] The oligonucleotide that was ligated to ClaI/HindIII-digested pJB8 contained the recognition sites of SwaI, PacI and SrfI restriction endonucleases. These sites will permit ready linearization of the pλCF-7-DTA construct.

[0659] Next, a 1.4-kb XhoI/SalI fragment of pMEP-DTA, containing the DT-A gene, was ligated to SalI-digested pJB83 to generate pJB83-DTA9.

[0660] d. Construction of λCF-7-DTA

[0661] The 12-bp overhangs of λCF-7 were removed by Mung bean nuclease and subsequent T4 polymerase treatments. The resulting 41.1-kb linear λCF-7 vector was then ligated to pJB83-DTA9 which had been digested with ClaI and treated with T4 polymerase. The resulting vector, λCF-7-DTA, contains all the elements of λCF-7 as well as the DT-A gene linked to the TK promoter and the SV40 polyadenylation signal, the 1.8-kb Charon 4A λ COS region, the ampicillin-resistance gene [from pJB83-DTA9] and the ColE1 origin of replication [from pJB83-DTA9].

[0662] D. Targeting Vectors using Luciferase Markers: Plasmid pMCT-RUC

[0663] Plasmid pMCT-RUC [14 kbp] was constructed for site-specific targeting of the Renilla luciferase [see, e.g., U.S. Pat. Nos. 5,292,658 and 5,418,155 for a description of DNA encoding Renilla luciferase, and plasmid pTZrLuc-1, which can provide the starting material for construction of such vectors] gene to a mammalian artificial chromosome. The relevant features of this plasmid are the Renilla luciferase gene under transcriptional control of the human cytomegalovirus immediate-early gene enhancer/promoter and the hygromycin-resistance gene, a positive selectable marker, under the transcriptional control of the thymidine kinase promoter. In particular, this plasmid contains plasmid pAG60 [see, e.g., U.S. Pat. Nos. 5,118,620, 5,021,344, 5,063,162 and 4,946,952; see, also Colbert-Garapin et al. (1981) J. Mol. Biol. 150:1-14], which includes DNA (i.e., the neomycin-resistance gene) homologous to the minichromosome, as well as the Renilla and hygromycin-resistance genes, the HSV-tk gene under control of the tk promoter as a negative selectable marker for homologous recombination, and a unique HpaI site for linearizing the plasmid.

[0664] This construct was introduced, via calcium phosphate transfection, into EC3/7C5 cells [see, Lorenz et al. (1996) J. Biolum. Chemilum. 11:31-37]. The EC3/7C5 cells were maintained as a monolayer [see, Gluzman (1981) Cell 23:175-183]. Cells at 50% confluency in 100 mm Petri dishes were used for calcium phosphate transfection [see, Harper et al. (1981) Chromosoma 83:431-439] using 10 μg of linearized pMCT-RUC per plate. Colonies originating from single transfected cells were isolated and maintained in F-12 medium containing hygromycin (300 μg/mL) and 10% fetal bovine serum. Cells were grown in 100 mm Petri dishes prior to the Renilla luciferase assay.

[0665] The Renilla luciferase assay was performed [see, e.g., Matthews et al. (1977) Biochemistry 16:85-91]. Hygromycin-resistant cell lines obtained after transfection of EC3/7C5 cells with linearized plasmid pMCT-RUC [“B” cell lines] were grown to 100% confluency for measurements of light emission in vivo and in vitro. Light emission was measured in vivo after about 30 generations as follows: growth medium was removed and replaced by 1 mL RPMI 1640 containing coelenterazine [1 mmol/L final concentration]. Light emission from cells was then visualized by placing the Petri dishes in a low light video image analyzer [Hamamatsu Argus-100]. An image was formed after 5 min. of photon accumulation using 100% sensitivity of the photon counting tube. For measuring light emission in vitro, cells were trypsinized and harvested from one Petri dish, pelleted, resuspended in 1 mL assay buffer [0.5 mol/L NaCl, 1 mmol/L EDTA, 0.1 mol/L potassium phosphate, pH 7.4] and sonicated on ice for 10 s. Lysates were then assayed in a Turner TD-20e luminometer for 10 s after rapid injection of 0.5 mL of 1 mmol/L coelenterazine, and the average value of light emission was recorded as LU [1 LU=1.6×10⁶ hν/s for this instrument].
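As a rough illustration of the instrument calibration quoted above, reading the stated factor as 1 LU corresponding to 1.6×10⁶ photons per second, luminometer readings can be converted to photon fluxes as sketched below. The function names are hypothetical and are not part of the described assay.

```python
# Calibration factor quoted in the text for the luminometer used:
# 1 LU is read here as 1.6e6 photons per second (an interpretation of the text).
PHOTONS_PER_SECOND_PER_LU = 1.6e6


def lu_to_photons_per_second(light_units: float) -> float:
    """Convert an average reading in LU to a photon flux (photons/s)."""
    if light_units < 0:
        raise ValueError("light_units must be non-negative")
    return light_units * PHOTONS_PER_SECOND_PER_LU


def photons_in_window(light_units: float, seconds: float = 10.0) -> float:
    """Estimate photons emitted over a measurement window.

    The 10 s default mirrors the measurement time stated in the text;
    it assumes the average reading holds over the whole window.
    """
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return lu_to_photons_per_second(light_units) * seconds
```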

[0666] Independent cell lines of EC3/7C5 cells transfected with linearized plasmid pMCT-RUC showed different levels of Renilla luciferase activity. Similar differences in light emission were observed when measurements were performed on lysates of the same cell lines. This variation in light emission was probably due to a position effect resulting from the random integration of plasmid pMCT-RUC into the mouse genome, since enrichment for site targeting of the luciferase gene was not performed in this experiment.

[0667] To obtain transfectant populations enriched in cells in which the luciferase gene had integrated into the minichromosome, transfected cells were grown in the presence of ganciclovir. This negative selection medium selects against cells in which the added pMCT-RUC plasmid integrated into the host EC3/7C5 genome. This selection thereby enriches the surviving transfectant population with cells containing pMCT-RUC in the minichromosome. The cells surviving this selection were evaluated in luciferase assays which revealed a more uniform level of luciferase expression. Additionally, the results of in situ hybridization assays indicated that the Renilla luciferase gene was contained in the minichromosome in these cells, which further indicates successful targeting of pMCT-RUC into the minichromosome.

[0668] Plasmid pNEM-1, a variant of pMCT-RUC which also contains λ DNA to provide an extended region of homology to the minichromosome [see, other targeting vectors, below], was also used to transfect EC3/7C5 cells. Site-directed targeting of the Renilla luciferase gene and the hygromycin-resistance gene in pNEM-1 to the minichromosome in the recipient EC3/7C5 cells was achieved. This was verified by DNA amplification analysis and by in situ hybridization. Additionally, luciferase gene expression was confirmed in luciferase assays of the transfectants.

[0669] E. Protein Secretion Targeting Vectors

[0670] Isolation of heterologous proteins produced intracellularly in mammalian cell expression systems requires cell disruption under potentially harsh conditions and purification of the recombinant protein from cellular contaminants. The process of protein isolation may be greatly facilitated by secretion of the recombinantly produced protein into the extracellular medium where there are fewer contaminants to remove during purification. Therefore, secretion targeting vectors have been constructed for use with the mammalian artificial chromosome system.

[0671] A useful model vector for demonstrating production and secretion of heterologous protein in mammalian cells contains DNA encoding a readily detectable reporter protein fused to an efficient secretion signal that directs transport of the protein to the cell membrane and secretion of the protein from the cell. Vectors pLNCX-ILRUC and pLNCX-ILRUCλ, described below, are examples of such vectors. These vectors contain DNA encoding an interleukin-2 (IL-2) signal peptide-Renilla reniformis luciferase fusion protein. The IL-2 signal peptide [encoded by the sequence set forth in SEQ ID No. 9] directs secretion of the luciferase protein, to which it is linked, from mammalian cells. Upon secretion from the host mammalian cell, the IL-2 signal peptide is cleaved from the fusion protein to deliver mature, active luciferase protein to the extracellular medium. Successful production and secretion of this heterologous protein can be readily detected by performing luciferase assays which measure the light emitted upon exposure of the medium to the bioluminescent luciferin substrate of the luciferase enzyme. Thus, this feature will be useful when artificial chromosomes are used for gene therapy. The presence of a functional artificial chromosome carrying an IL-Ruc fusion with the accompanying therapeutic genes will be readily monitored. Body fluids or tissues can be sampled and tested for luciferase expression by adding luciferin and appropriate cofactors and observing the bioluminescence.

[0672] 1. Construction of Protein Secretion Vector pLNCX-ILRUC

Vector pLNCX-ILRUC contains a human IL-2 signal peptide-R. reniformis fusion gene linked to the human cytomegalovirus (CMV) immediate early promoter for constitutive expression of the gene in mammalian cells. The construct was prepared as follows.

[0673] a. Preparation of the IL-2 signal sequence-encoding DNA

[0674] A 69-bp DNA fragment containing DNA encoding the human IL-2 signal peptide was obtained through nucleic acid amplification, using appropriate primers for IL-2, of an HEK 293 cell line [see, e.g., U.S. Pat. No. 4,518,584 for an IL-2 encoding DNA; see, also SEQ ID No. 9; the IL-2 gene and corresponding amino acid sequence are also provided in the Genbank Sequence Database as accession nos. K02056 and J00264]. The signal peptide includes the first 20 amino acids shown in the translations provided in both of these Genbank entries and in SEQ ID NO. 9. The corresponding nucleotide sequence encoding the first 20 amino acids is also provided in these entries [see, e.g., nucleotides 293-52 of accession no. K02056 and nucleotides 478-537 of accession no. J00264], as well as in SEQ ID NO. 9. The amplification primers included an EcoRI site [GAATTC] for subcloning of the DNA fragment after ligation into pGEMT [Promega]. The forward primer is set forth in SEQ ID No. 11 and the sequence of the reverse primer is set forth in SEQ ID No. 12.

TTTGAATTCATGTACAGGATGCAACTCCTG [SEQ ID No. 11] forward
TTTGAATTCAGTAGGTGCACTGTTTGTGAC [SEQ ID No. 12] reverse
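The layout of the two primers can be checked with a short script. The sketch below is illustrative only: it locates the EcoRI site in each primer, separates the 5′ tail from the portion downstream of the site, and translates that portion using the standard genetic code. The helper names are hypothetical, and treating everything 3′ of the EcoRI site as the annealing portion is an assumption.

```python
ECORI_SITE = "GAATTC"
FORWARD_PRIMER = "TTTGAATTCATGTACAGGATGCAACTCCTG"  # SEQ ID No. 11
REVERSE_PRIMER = "TTTGAATTCAGTAGGTGCACTGTTTGTGAC"  # SEQ ID No. 12

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons of seq, starting at its first base."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def split_at_ecori(primer: str) -> tuple[str, str]:
    """Split a primer into (5' tail, sequence 3' of the EcoRI site)."""
    start = primer.find(ECORI_SITE)
    if start < 0:
        raise ValueError("primer lacks an EcoRI site")
    return primer[:start], primer[start + len(ECORI_SITE):]


fwd_tail, fwd_anneal = split_at_ecori(FORWARD_PRIMER)
rev_tail, rev_anneal = split_at_ecori(REVERSE_PRIMER)
fwd_peptide = translate(fwd_anneal)                      # sense strand
rev_peptide = translate(reverse_complement(rev_anneal))  # sense strand
```

Under the standard genetic code, the forward primer portion translates to MYRMQLL and the reverse complement of the reverse primer portion to VTNSAPT.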

[0675] b. Preparation of the R. reniformis luciferase-encoding DNA

[0676] The initial source of the R. reniformis luciferase gene was plasmid pLXSN-RUC. Vector pLXSN [see, e.g., U.S. Pat. Nos. 5,324,655, 5,470,730, 5,468,634, 5,358,866 and Miller et al. (1989) Biotechniques 7:980] is a retroviral vector capable of expressing heterologous DNA under the transcriptional control of the retroviral LTR; it also contains the neomycin-resistance gene operatively linked for expression to the SV40 early region promoter. The R. reniformis luciferase gene was obtained from plasmid pTZrLuc-1 [see, e.g., U.S. Pat. No. 5,292,658; see also the Genbank Sequence Database accession no. M63501; and see also Lorenz et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 88:4438-4442] and is shown as SEQ ID NO. 10. The 0.97-kb EcoRI/SmaI fragment of pTZrLuc-1 contains the coding region of the Renilla luciferase-encoding DNA. Vector pLXSN was digested and ligated with the luciferase gene to yield pLXSN-RUC, which contains the luciferase gene operably linked to the viral LTR and upstream of the SV40 promoter, which directs expression of the neomycin-resistance gene.

[0677] c. Fusion of DNA Encoding the IL-2 Signal Peptide and the R. reniformis Luciferase Gene to Yield pLXSN-ILRUC

[0678] The pGEMT vector containing the IL-2 signal peptide-encoding DNA described in 1.a. above was digested with EcoRI, and the resulting fragment encoding the signal peptide was ligated to EcoRI-digested pLXSN-RUC. The resulting plasmid, called pLXSN-ILRUC, contains the IL-2 signal peptide-encoding DNA located immediately upstream of the R. reniformis gene in pLXSN-RUC. Plasmid pLXSN-ILRUC was then used as a template for nucleic acid amplification of the fusion gene in order to add a SmaI site at the 3′ end of the fusion gene. The amplification product was subcloned into linearized [EcoRI/SmaI-digested] pGEMT [Promega] to generate ILRUC-pGEMT.

[0679] d. Introduction of the Fusion Gene into a Vector Containing Control Elements for Expression in Mammalian Cells

[0680] Plasmid ILRUC-pGEMT was digested with KspI and SmaI to release a fragment containing the IL-2 signal peptide-luciferase fusion gene which was ligated to HpaI-digested pLNCX. Vector pLNCX [see, e.g., U.S. Pat. Nos. 5,324,655 and 5,457,182; see, also Miller and Rosman (1989) Biotechniques 7:980-990] is a retroviral vector for expressing heterologous DNA under the control of the CMV promoter; it also contains the neomycin-resistance gene under the transcriptional control of a viral promoter. The vector resulting from the ligation reaction was designated pLNCX-ILRUC. Vector pLNCX-ILRUC contains the IL-2 signal peptide-luciferase fusion gene located immediately downstream of the CMV promoter and upstream of the viral 3′ LTR and polyadenylation signal in pLNCX. This arrangement provides for expression of the fusion gene under the control of the CMV promoter. Placement of the heterologous protein-encoding DNA [i.e., the luciferase gene] in operative linkage with the IL-2 signal peptide-encoding DNA provides for expression of the fusion in mammalian cells transfected with the vector such that the heterologous protein is secreted from the host cell into the extracellular medium.

[0681] 2. Construction of Protein Secretion Targeting Vector pLNCX-ILRUCλ

[0682] Vector pLNCX-ILRUC may be modified so that it can be used to introduce the IL-2 signal peptide-luciferase fusion gene into a mammalian artificial chromosome in a host cell. To facilitate specific incorporation of the pLNCX-ILRUC expression vector into a mammalian artificial chromosome, nucleic acid sequences that are homologous to nucleotides present in the artificial chromosome are added to the vector to permit site directed recombination.

[0683] Exemplary artificial chromosomes described herein contain λ phage DNA. Therefore, protein secretion targeting vector pLNCX-ILRUCλ was prepared by addition of λ phage DNA [from Charon 4A arms] to the secretion vector pLNCX-ILRUC.

[0684] 3. Expression and Secretion of R. reniformis Luciferase from Mammalian Cells

[0685] a. Expression of R. reniformis Luciferase Using pLNCX-ILRUC

[0686] Mammalian cells [LMTK⁻ from the ATCC] were transiently transfected with vector pLNCX-ILRUC [˜10 μg] by electroporation [BIORAD, performed according to the manufacturer's instructions]. Stable transfectants produced by growth in G418 for neo selection have also been prepared.

[0687] Transfectants were grown and then analyzed for expression of luciferase. To determine whether active luciferase was secreted from the transfected cells, culture media were assayed for luciferase by addition of coelenterazine [see, e.g., Matthews et al. (1977) Biochemistry 16:85-91].

[0688] The results of these assays establish that vector pLNCX-ILRUC is capable of providing constitutive expression of heterologous DNA in mammalian host cells. Furthermore, the results demonstrate that the human IL-2 signal peptide is capable of directing secretion of proteins fused to the C-terminus of the peptide. Additionally, these data demonstrate that the R. reniformis luciferase protein is a highly effective reporter molecule, which is stable in a mammalian cell environment, and forms the basis of a sensitive, facile assay for gene expression.

[0689] b. Renilla reniformis Luciferase Appears to be Secreted from LMTK⁻ Cells.

[0690] (i) Renilla luciferase assay of cell pellets

[0691] The following cells were tested:

[0692] cells with no vector: LMTK⁻ cells without vector as a negative control;

[0693] cells transfected with pLNCX only;

[0694] cells transfected with RUC-pLNCX [Renilla luciferase gene in pLNCX vector];

[0695] cells transfected with pLNCX-ILRUC [vector containing the IL-2 leader sequence+Renilla luciferase fusion gene in pLNCX vector].

[0696] Forty-eight hours after electroporation, the cells and culture medium were collected. The cell pellet from 4 plates of cells was resuspended in 1 ml assay buffer and was lysed by sonication. Two hundred μl of the resuspended cell pellet was used for each assay for luciferase activity [see, e.g., Matthews et al. (1977) Biochemistry 16:85-91]. The assay was repeated three times and the average bioluminescence measurement was obtained.

[0697] The results showed that there was relatively low background bioluminescence in the cells transformed with pLNCX or the negative control cells; there was a low level observed in the cell pellet from cells containing the vector with the IL-2 leader sequence-luciferase gene fusion and more than 5000 RLU in the sample from cells containing RUC-pLNCX.

[0698] (ii) Renilla Luciferase Assay of Cell Medium

[0699] Forty milliliters of medium from 4 plates of cells were harvested and spun down. Two hundred microliters of medium was used for each luciferase activity assay. The assay was repeated several times and the average bioluminescence measurement was obtained. These results showed that a relatively high level of bioluminescence was detected in the cell medium from cells transformed with pLNCX-ILRUC; about 10-fold lower levels [slightly above the background levels in medium from cells with no vector or transfected with pLNCX only] were detected for the cells transfected with RUC-pLNCX.
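Because the pellet and medium assays sample very different fractions of the harvested material, a raw reading from one cannot be compared directly with a raw reading from the other. A minimal sketch of one possible normalization, using only the volumes stated above (200 μl of a 1 ml pellet lysate and 200 μl of 40 ml of medium, each from 4 plates), follows. The function names are hypothetical, and this normalization is offered as an illustration rather than as the procedure that was used.

```python
def plate_equivalents(assayed_ml: float, total_ml: float, plates: int) -> float:
    """Number of plates' worth of material represented by one assayed aliquot."""
    if assayed_ml <= 0 or total_ml <= 0 or plates <= 0:
        raise ValueError("volumes and plate count must be positive")
    if assayed_ml > total_ml:
        raise ValueError("aliquot cannot exceed the total volume")
    return plates * assayed_ml / total_ml


def signal_per_plate(reading: float, assayed_ml: float,
                     total_ml: float, plates: int) -> float:
    """Scale a raw luminometer reading to a per-plate value."""
    return reading / plate_equivalents(assayed_ml, total_ml, plates)


# Volumes taken from the text: 4 plates pooled in both cases.
PELLET_EQUIV = plate_equivalents(0.2, 1.0, 4)   # 200 ul of 1 ml lysate
MEDIUM_EQUIV = plate_equivalents(0.2, 40.0, 4)  # 200 ul of 40 ml medium

# Factor by which a medium reading under-represents material
# relative to a pellet reading of the same raw value.
RELATIVE_DILUTION = PELLET_EQUIV / MEDIUM_EQUIV
```

With these volumes, a pellet aliquot represents 0.8 plate equivalents and a medium aliquot 0.02, a 40-fold difference in the amount of material sampled.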

[0700] (iii) Conclusions

[0701] The results of these experiments demonstrated that Renilla luciferase appears to be secreted from LMTK⁻ cells under the direction of the IL-2 signal peptide. The medium from cells transfected with Renilla luciferase-encoding DNA linked to the DNA encoding the IL-2 secretion signal had substantially higher levels of Renilla luciferase activity than controls or cells containing luciferase-encoding DNA without the signal peptide-encoding DNA. Also, the differences between the controls and cells containing luciferase-encoding DNA demonstrate that the luciferase activity is specifically from luciferase, not from a non-specific reaction. In addition, the results from the medium of RUC-pLNCX transfected cells, which are similar to background, show that the luciferase activity in the medium does not come from cell lysis, but from secreted luciferase.

[0702] c. Expression of R. reniformis Luciferase Using pLNCX-ILRUCλ

[0703] To express the IL-2 signal peptide-R. reniformis fusion gene from a mammalian artificial chromosome, vector pLNCX-ILRUCλ is targeted for site-specific integration into a mammalian artificial chromosome through homologous recombination of the λ DNA sequences contained in the chromosome and the vector. This is accomplished by introduction of pLNCX-ILRUCλ into either a fusion cell line harboring mammalian artificial chromosomes or mammalian host cells that contain mammalian artificial chromosomes. If the vector is introduced into a fusion cell line harboring the artificial chromosomes, for example through microinjection of the vector or transfection of the fusion cell line with the vector, the cells are then grown under selective conditions. The artificial chromosomes, which have incorporated vector pLNCX-ILRUCλ, are isolated from the surviving cells, using purification procedures as described above, and then injected into the mammalian host cells.

[0704] Alternatively, the mammalian host cells may first be injected with mammalian artificial chromosomes which have been isolated from a fusion cell line. The host cells are then transfected with vector pLNCX-ILRUCλ and grown.

[0705] The recombinant host cells are then assayed for luciferase expression as described above.

[0706] F. Other Targeting Vectors

[0707] These vectors, which are based on vector pMCT-RUC, rely on positive and negative selection to ensure insertion and selection for the double recombinants. A single crossover results in incorporation of the DT-A gene, which kills the cell; double crossover recombinations delete the DT-A gene.
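The selection logic just described can be summarized in a small model. The sketch below is a deliberate simplification: it assumes that a single crossover carries both the hygromycin-resistance gene and DT-A into the cell, that a double crossover carries only the hygromycin-resistance gene, and that selection is applied with hygromycin. The class and function names are hypothetical.

```python
from enum import Enum


class Integration(Enum):
    NONE = "no integration"
    SINGLE_CROSSOVER = "single crossover"
    DOUBLE_CROSSOVER = "double crossover"


def retained_markers(event: Integration) -> frozenset:
    """Markers assumed to be present after each kind of event."""
    if event is Integration.NONE:
        return frozenset()
    if event is Integration.SINGLE_CROSSOVER:
        # Whole vector incorporated, including the lethal DT-A gene.
        return frozenset({"Hyg", "DT-A"})
    # Double crossover: the DT-A gene is deleted.
    return frozenset({"Hyg"})


def survives_selection(event: Integration) -> bool:
    """A cell survives only with hygromycin resistance and without DT-A."""
    markers = retained_markers(event)
    return "Hyg" in markers and "DT-A" not in markers


SURVIVORS = [event for event in Integration if survives_selection(event)]
```

Under these assumptions only the double recombinants remain after selection, which is the enrichment the vectors are designed to provide.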

[0708] 1. Plasmid pNEM-1 contains:

[0709] DT-A: Diphtheria toxin gene (negative selectable marker)

[0710] Hyg: Hygromycin gene (positive selectable marker)

[0711] ruc: Renilla luciferase gene (non-selectable marker)

[0712] 1: LTR-MMTV promoter

[0713] 2: TK promoter

[0714] 3: CMV promoter

[0715] MMR: Homology region (plasmid pAG60)

[0716] 2. Plasmids pNEM-2 and -3 are similar to pNEM-1 except for different negative selectable markers:

[0717] pNEM-1: diphtheria toxin gene as “-” selectable marker

[0718] pNEM-2: hygromycin antisense gene as “-” selectable marker

[0719] pNEM-3: thymidine kinase HSV-1 gene as “-” selectable marker

[0720] 3. Plasmid—λ DNA based homology:

[0721] pNEMλ-1: base vector

[0722] pNEMλ-2: base vector containing p5=gene

[0723] 1: LTR MMTV promoter

[0724] 2: SV40 promoter

[0725] 3: CMV promoter

[0726] 4: MTIIA promoter (metallothionein gene promoter)

[0727] —homology region (plasmid pAG60)

[0728] λ L.A. and λ R.A. homology regions for λ left and right arms (λgt-WES).

EXAMPLE 13

[0729] Microinjection of Mammalian Cells with Plasmid DNA

[0730] These procedures will be used to microinject MACs into eukaryotic cells, including mammalian and insect cells.

[0731] The microinjection technique is based on the use of small glass capillaries as a delivery system into cells and has been used for introduction of DNA fragments into nuclei [see, e.g., Chalfie et al. (1994) Science 263:802-804]. It allows the transfer of almost any type of molecule, e.g., hormones, proteins, DNA and RNA, into either the cytoplasm or nuclei of recipient cells. This technique has no cell type restriction and is more efficient than other methods, including Ca²⁺-mediated gene transfer and liposome-mediated gene transfer. About 20-30% of the injected cells become successfully transformed.

[0732] Microinjection is performed under a phase-contrast microscope. A glass microcapillary, prefilled with the DNA sample, is directed into a cell to be injected with the aid of a micromanipulator. An appropriate sample volume (1-10 pl) is transferred into the cell by gentle air pressure exerted by a transjector connected to the capillary. Recipient cells are grown on glass slides imprinted with numbered squares for convenient localization of the injected cells.

[0733] a. Materials and Equipment

[0734] Nunclon tissue culture dishes 35×10 mm, mouse cell line EC3/7C5, plasmid DNA pCH110 [Pharmacia], purified Green Fluorescent Protein (GFP) [GFPs from Aequorea and Renilla have been purified and also DNA encoding GFPs has been cloned; see, e.g., Prasher et al. (1992) Gene 111:229-233; International PCT Application No. WO 95/07463, which is based on U.S. application Ser. No. 08/119,678 and U.S. application Ser. No. 08/192,274], ZEISS Axiovert 100 microscope, Eppendorf transjector 5246, Eppendorf micromanipulator 5171, Eppendorf Cellocate coverslips, Eppendorf microloaders, Eppendorf femtotips and other standard equipment.

[0735] b. Protocol for Injecting

[0736] (1) Fibroblast cells are grown in 35 mm tissue culture dishes(37° C., 5% CO₂) until the cell density reaches 80% confluency. Thedishes are removed from the incubator and medium is added to about a 5mm depth.

[0737] (2) The dish is placed onto the dish holder and the cellsobserved with 10×objective; the focus is desirably above the cellsurface.

[0738] (3) Plasmid or chromosomal DNA solution [1 ng/μl] and GFP proteinsolution are further purified by centrifuging the DNA sample at a forcesufficient to remove any particular debris [typically about 10,000 rpmfor 10 minutes in a microcentrifuge].

[0739] (4) Two μl of the DNA solution (1 ng/μl) are loaded into a microcapillary with an Eppendorf microloader. During loading, the loader is inserted to the tip end of the microcapillary. GFP (1 mg/ml) is loaded by the same procedure.

[0740] (5) The protecting sheath is removed from the microcapillary and the microcapillary is fixed onto the capillary holder connected with the micromanipulator.

[0741] (6) The capillary tip is lowered to the surface of the medium and is focussed on the cells gradually until the tip of the capillary reaches the surface of a cell. The capillary is lowered further so that it is inserted into the cell. Various parameters, such as the level of the capillary, the time and pressure, are determined for the particular equipment. For example, using the fibroblast cell line C5 and the above-noted equipment, the best conditions are: injection time 0.4 second, pressure 80 psi. DNA can then be automatically injected into the nuclei of the cells.

[0742] (7) After injection, the cells are returned to the incubator and incubated for about 18-24 hours.

[0743] (8) After incubation, the number of transformants can be determined by a suitable method, which depends upon the selection marker. For example, if green fluorescent protein is used, the assay can be performed using a UV light source and fluorescent filter set at 0-24 hours after injection. If β-gal-containing DNA, such as DNA derived from pCH110, has been injected, then the transformants can be assayed for β-gal.
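As a rough order-of-magnitude check on the protocol above, the following Python sketch estimates how much DNA a single injection delivers at the stated concentration (1 ng/μl) and injection volume (1-10 pl). The plasmid length of 7,000 bp and the average mass of 650 g/mol per base pair are illustrative assumptions, not values given in this protocol, and the function names are hypothetical.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average mass of one double-stranded base pair


def injected_mass_fg(conc_ng_per_ul: float, volume_pl: float) -> float:
    """Mass of DNA delivered per injection, in femtograms.

    1 ng/ul equals 1e-3 g/L and 1 pl equals 1e-12 L, so the product
    of (ng/ul) and (pl) is numerically the mass in fg (1e-15 g).
    """
    return conc_ng_per_ul * volume_pl


def plasmid_copies(mass_fg: float, length_bp: int) -> float:
    """Approximate number of double-stranded molecules in mass_fg of DNA."""
    grams = mass_fg * 1e-15
    moles = grams / (length_bp * BP_MASS_G_PER_MOL)
    return moles * AVOGADRO


if __name__ == "__main__":
    ASSUMED_PLASMID_BP = 7000  # hypothetical plasmid length, for illustration only
    for volume_pl in (1.0, 10.0):
        mass = injected_mass_fg(1.0, volume_pl)
        copies = plasmid_copies(mass, ASSUMED_PLASMID_BP)
        print(f"{volume_pl:4.1f} pl -> {mass:4.1f} fg, about {copies:.0f} copies")
```

Under these assumptions a 1 pl injection carries on the order of a hundred plasmid molecules; a different plasmid length scales the copy number inversely.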

[0744] c. Detection of β-galactosidase in Cells Injected with Plasmid DNA

[0745] The medium is removed from the culture plate and the cells are fixed by addition of 5 ml of Fixation Solution I (1% glutaraldehyde; 0.1 M sodium phosphate buffer, pH 7.0; 1 mM MgCl₂), and incubated for 15 minutes at 37° C. Fixation Solution I is replaced with 5 ml of X-gal Solution II [0.2% X-gal, 10 mM sodium phosphate buffer (pH 7.0), 150 mM NaCl, 1 mM MgCl₂, 3.3 mM K₄Fe(CN)₆H₂O, 3.3 mM K₃Fe(CN)₆], and the plates are incubated for 30-60 minutes at 37° C. The X-gal solution is removed and 2 ml of 70% glycerol is added to each dish. Blue stained cells are identified under a light microscope.
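The final concentrations of X-gal Solution II listed above can be turned into pipetting volumes with the usual dilution relation C1·V1 = C2·V2. The sketch below is only an illustration: the stock concentrations are hypothetical placeholders (the text gives final concentrations only), and the function and dictionary names are not from the source.

```python
def stock_volume(final_conc: float, stock_conc: float, final_volume_ml: float) -> float:
    """Volume of stock (ml) needed for the given final concentration.

    final_conc and stock_conc must be expressed in the same units.
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be at least as concentrated as the final solution")
    return final_conc * final_volume_ml / stock_conc


# Final concentrations are taken from the text; stock concentrations are assumed.
X_GAL_SOLUTION_II = {
    # component: (final concentration, assumed stock concentration, unit)
    "X-gal": (0.2, 2.0, "%"),
    "sodium phosphate, pH 7.0": (10.0, 100.0, "mM"),
    "NaCl": (150.0, 5000.0, "mM"),
    "MgCl2": (1.0, 1000.0, "mM"),
    "K4Fe(CN)6": (3.3, 100.0, "mM"),
    "K3Fe(CN)6": (3.3, 100.0, "mM"),
}


def recipe(final_volume_ml: float) -> dict:
    """Volumes (ml) of each assumed stock plus water for one batch."""
    volumes = {
        name: stock_volume(final, stock, final_volume_ml)
        for name, (final, stock, _unit) in X_GAL_SOLUTION_II.items()
    }
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for name, ml in recipe(5.0).items():  # 5 ml is used per dish in the text
        print(f"{name:26s} {ml:6.3f} ml")
```

Changing a stock concentration in the table only changes the corresponding volume and the water make-up; the final concentrations stay as stated in the protocol.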

[0746] This method will be used to introduce a MAC, particularly the MAC with the anti-HIV megachromosome, to produce a mouse model for anti-HIV activity.

EXAMPLE 14

[0747] Transgenic (non-human) Animals

[0748] Transgenic (non-human) animals can be generated that express heterologous genes which confer desired traits, e.g., disease resistance, in the animals. A transgenic mouse is prepared to serve as a model of a disease-resistant animal. Genes that encode vaccines or that encode therapeutic molecules can be introduced into embryos or ES cells to produce animals that express the gene product and thereby are resistant to or less susceptible to a particular disorder.

[0749] The mammalian artificial megachromosome and others of the artificial chromosomes, particularly the SATACs, can be used to generate transgenic (non-human) animals, including mammals and birds, that stably express genes conferring desired traits, such as genes conferring resistance to pathogenic viruses. The artificial chromosomes can also be used to produce transgenic (non-human) animals, such as pigs, that can produce immunologically humanized organs for xenotransplantation.

[0750] For example, transgenic mice containing a transgene encoding an anti-HIV ribozyme provide a useful model for the development of stable transgenic (non-human) animals using these methods. The artificial chromosomes can be used to produce transgenic (non-human) animals, particularly cows, goats, mice, oxen, camels, pigs and sheep, that produce the proteins of interest in their milk; and to produce transgenic chickens and other egg-producing fowl that produce therapeutic proteins or other proteins of interest in their eggs. For example, use of mammary gland-specific promoters for expression of heterologous DNA in milk is known [see, e.g., U.S. Pat. No. 4,873,316]. In particular, a milk-specific promoter or a promoter, preferably linked to a milk-specific signal peptide, specifically activated in mammary tissue is operatively linked to the DNA of interest, thereby providing expression of that DNA sequence in milk.

[0751] 1. Development of Control Transgenic Mice Expressing Anti-HIV Ribozyme

[0752] Control transgenic mice are generated in order to compare stability and amounts of transgene expression in mice developed using transgene DNA carried on a vector (control mice) with expression in mice developed using transgenes carried in an artificial megachromosome.

[0753] a. Development of Control Transgenic Mice Expressing β-galactosidase

[0754] One set of control transgenic mice was generated by microinjection of mouse embryos with the β-galactosidase gene alone. The microinjection procedure used to introduce the plasmid DNA into the mouse embryos is as described in Example 13, but modified for use with embryos [see, e.g., Hogan et al. (1994) Manipulating the Mouse Embryo, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., see, especially pages 255-264 and Appendix 3]. Fertilized mouse embryos [Strain CB6 obtained from Charles River Co.] were injected with 1 ng of plasmid pCH110 (Pharmacia) which had been linearized by digestion with BamHI. This plasmid contains the β-galactosidase gene linked to the SV40 late promoter. The β-galactosidase gene product provides a readily detectable marker for successful transgene expression. Furthermore, these control mice provide confirmation of the microinjection procedure used to introduce the plasmid into the embryos. Additionally, because the megachromosome that is transferred to the mouse embryos in the model system (see below) also contains the β-galactosidase gene, the control transgenic mice that have been generated by injection of pCH110 into embryos serve as an analogous system for comparison of heterologous gene expression from a plasmid versus from a gene carried on an artificial chromosome.

[0755] After injection, the embryos are cultured in modified HTF medium under 5% CO₂ at 37° C. for one day until they divide to form two cells. The two-cell embryos are then implanted into surrogate mother female mice [for procedures see, Manipulating the Mouse Embryo, A Laboratory Manual (1994) Hogan et al., eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pp. 127 et seq.].

[0756] b. Development of Control Transgenic Mice Expressing Anti-HIV Ribozyme

[0757] One set of anti-HIV ribozyme gene-containing control transgenic mice was generated by microinjection of mouse embryos with plasmid pCEPUR-132, which contains three different genes: (1) DNA encoding an anti-HIV ribozyme, (2) the puromycin-resistance gene and (3) the hygromycin-resistance gene. Plasmid pCEPUR-132 was constructed by ligating portions of plasmid pCEP-132 containing the anti-HIV ribozyme gene (referred to as ribozyme D by Chang et al. [(1990) Clin. Biotech. 2:23-31]; see also U.S. Pat. No. 5,144,019 to Rossi et al., particularly FIG. 4 of the patent) and the hygromycin-resistance gene with a portion of plasmid pCEPUR containing the puromycin-resistance gene.

[0758] Plasmid pCEP-132 was constructed as follows. Vector pCEP4 (Invitrogen, San Diego, Calif.; see also Yates et al. (1985) Nature 313:812-815) was digested with XhoI, which cleaves in the multiple cloning site region of the vector. This ~10.4-kb vector contains the hygromycin-resistance gene linked to the thymidine kinase gene promoter and polyadenylation signal, as well as the ampicillin-resistance gene and ColE1 origin of replication and EBNA-1 (Epstein-Barr virus nuclear antigen) genes and OriP. The multiple cloning site is flanked by the cytomegalovirus promoter and SV40 polyadenylation signal.

[0759] XhoI-digested pCEP4 was ligated with a fragment obtained by digestion of plasmid 132 (see Example 4 for a description of this plasmid) with XhoI and SalI. This XhoI/SalI fragment contains the anti-HIV ribozyme gene linked at the 3′ end to the SV40 polyadenylation signal. The plasmid resulting from this ligation was designated pCEP-132. Thus, in effect, pCEP-132 comprises pCEP4 with the anti-HIV ribozyme gene and SV40 polyadenylation signal inserted in the multiple cloning site for CMV promoter-driven expression of the anti-HIV ribozyme gene.

[0760] To generate pCEPUR-132, pCEP-132 was ligated with a fragment of pCEPUR. pCEPUR was prepared by ligating a 7.7-kb fragment generated upon NheI/NruI digestion of pCEP4 with a 1.1-kb NheI/SnaBI fragment of pBabe [see Morgenstern and Land (1990) Nucleic Acids Res. 18:3587-3596 for a description of pBabe] that contains the puromycin-resistance gene linked at the 5′ end to the SV40 promoter. Thus, pCEPUR is made up of the ampicillin-resistance and EBNA1 genes, as well as the ColE1 and OriP elements from pCEP4 and the puromycin-resistance gene from pBabe. The puromycin-resistance gene in pCEPUR is flanked by the SV40 promoter (from pBabe) at the 5′ end and the SV40 polyadenylation signal (from pCEP4) at the 3′ end.

[0761] Plasmid pCEPUR was digested with XhoI and SalI and the fragment containing the puromycin-resistance gene linked at the 5′ end to the SV40 promoter was ligated with XhoI-digested pCEP-132 to yield the ~12.1-kb plasmid designated pCEPUR-132. Thus, pCEPUR-132, in effect, comprises pCEP-132 with the puromycin-resistance gene and SV40 promoter inserted at the XhoI site. The main elements of pCEPUR-132 are the hygromycin-resistance gene linked to the thymidine kinase promoter and polyadenylation signal, the anti-HIV ribozyme gene linked to the CMV promoter and SV40 polyadenylation signal, and the puromycin-resistance gene linked to the SV40 promoter and polyadenylation signal. The plasmid also contains the ampicillin-resistance and EBNA1 genes and the ColE1 origin of replication and OriP.

[0762] Zygotes were prepared from (C57BL/6JxCBA/J) F1 female mice [see, e.g., Manipulating the Mouse Embryo, A Laboratory Manual (1994) Hogan et al., eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., p. 429], which had been previously mated with a (C57BL/6JxCBA/J) F1 male. The male pronuclei of these F2 zygotes were injected [see, Manipulating the Mouse Embryo, A Laboratory Manual (1994) Hogan et al., eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.] with pCEPUR-132 (~3 μg/ml), which had been linearized by digestion with NruI. The injected eggs were then implanted in surrogate mother female mice for development into transgenic offspring.

[0763] These primary carrier offspring were analyzed (as described below) for the presence of the transgene in DNA isolated from tail cells. Seven carrier mice that contained transgenes in their tail cells (but that may not carry the transgene in all their cells, i.e., they may be chimeric) were allowed to mate to produce non-chimeric or germ-line heterozygotes. The heterozygotes were, in turn, crossed to generate homozygote transgenic offspring.
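The breeding scheme above (carriers to heterozygotes, heterozygotes intercrossed to obtain homozygotes) follows ordinary Mendelian segregation if the transgene behaves as a single locus. The following Python sketch enumerates the expected genotype fractions under that assumption; the allele labels "T" (transgene present) and "+" (transgene absent) and the function name are hypothetical, and single-locus inheritance is an assumption rather than a result reported here.

```python
from collections import Counter
from itertools import product


def cross(parent_a: tuple, parent_b: tuple) -> dict:
    """Expected genotype fractions among offspring of two diploid parents.

    Each parent is a pair of alleles; every combination of one allele
    from each parent is taken to be equally likely.
    """
    counts = Counter()
    for allele_a, allele_b in product(parent_a, parent_b):
        # Sort so that "T/+" and "+/T" are counted as the same genotype.
        genotype = "/".join(sorted((allele_a, allele_b), reverse=True))
        counts[genotype] += 1
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


if __name__ == "__main__":
    heterozygote = ("T", "+")
    wild_type = ("+", "+")
    print("heterozygote x wild type:   ", cross(heterozygote, wild_type))
    print("heterozygote x heterozygote:", cross(heterozygote, heterozygote))
```

Under this model one quarter of the offspring of a heterozygote intercross are expected to be homozygous for the transgene, one half heterozygous and one quarter non-transgenic.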

[0764] 2. Development of Model Transgenic Mice Using Mammalian Artificial Chromosomes

[0765] Fertilized mouse embryos are microinjected (as described above) with megachromosomes (1-10 pL containing 0-1 chromosomes/pL) isolated from fusion cell line G3D5 or H1D3 (described above). The megachromosomes are isolated as described herein. Megachromosomes isolated from either cell line carry the anti-HIV ribozyme (ribozyme D) gene as well as the hygromycin-resistance and β-galactosidase genes. The injected embryos are then developed into transgenic mice as described above.
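Because the suspension is dilute (0-1 chromosomes/pL) and the injected volume is small (1-10 pL), the number of megachromosomes delivered to any one embryo is variable. One simple way to reason about this is to treat the count as Poisson-distributed with mean equal to density times volume. This statistical model is an assumption introduced here for illustration and is not stated in the text; the function names are hypothetical.

```python
import math


def expected_chromosomes(density_per_pl: float, volume_pl: float) -> float:
    """Mean number of chromosomes delivered in one injection."""
    if density_per_pl < 0 or volume_pl < 0:
        raise ValueError("density and volume must be non-negative")
    return density_per_pl * volume_pl


def prob_at_least_one(density_per_pl: float, volume_pl: float) -> float:
    """P(at least one chromosome delivered) under the assumed Poisson model."""
    mean = expected_chromosomes(density_per_pl, volume_pl)
    return 1.0 - math.exp(-mean)


if __name__ == "__main__":
    for density in (0.1, 0.5, 1.0):      # chromosomes per pL
        for volume in (1.0, 10.0):       # pL injected
            p = prob_at_least_one(density, volume)
            print(f"{density:.1f}/pL x {volume:4.1f} pL -> P(>=1) = {p:.3f}")
```

Under this model, injecting 1 pL at 1 chromosome/pL delivers at least one chromosome in roughly 63% of injections, which suggests why the upper end of the volume range may be useful with dilute preparations.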

[0766] Alternatively, the megachromosome-containing cell line G3D5* or H1D3* is fused with mouse embryonic stem cells [see, e.g., U.S. Pat. No. 5,453,357, commercially available; see Manipulating the Mouse Embryo, A Laboratory Manual (1994) Hogan et al., eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., pages 253-289] following standard procedures [see also, e.g., Guide to Techniques in Mouse Development in Methods in Enzymology Vol. 25, Wassarman and DePamphilis, eds. (1993), pages 803-932]. (It is also possible to deliver isolated megachromosomes into embryonic stem cells using the microcell procedure [such as that described above].) The stem cells are cultured in the presence of a fibroblast [e.g., STO fibroblasts that are resistant to hygromycin and puromycin]. Cells of the resultant fusion cell line, which contains megachromosomes carrying the transgenes [i.e., anti-HIV ribozyme, hygromycin-resistance and β-galactosidase genes], are then transplanted into mouse blastocysts, which are in turn implanted into a surrogate mother female mouse where development into a transgenic mouse will occur.

[0767] Mice generated by this method are chimeric; the transgenes will be expressed in only certain areas of the mouse, e.g., the head, and thus may not be expressed in all cells.

[0768] 3. Analysis of Transgenic Mice for Transgene Expression

[0769] Beginning when the transgenic mice, generated as described above, are three-to-four weeks old, they can be analyzed for stable expression of the transgenes that were transferred into the embryos [or fertilized eggs] from which they develop. The transgenic mice may be analyzed in several ways as follows.

[0770] a. Analysis of Cells Obtained from the Transgenic Mice

[0771] Cell samples [e.g., spleen, liver and kidney cells, lymphocytes, tail cells] are obtained from the transgenic mice. Any cells may be tested for transgene expression. If, however, the mice are chimeras generated by microinjection of fertilized eggs or by fusion of embryonic stem cells with megachromosome-containing cells, only cells from areas of the mouse that carry the transgene are expected to express the transgene. If the cells survive growth on hygromycin [or hygromycin and puromycin or neomycin, if the cells are obtained from mice generated by transfer of both antibiotic-resistance genes], this is one indication that they are stably expressing the transgenes. RNA isolated from the cells according to standard methods may also be analyzed by northern blot procedures to determine if the cells express transcripts that hybridize to nucleic acid probes based on the antibiotic-resistance genes. Additionally, cells obtained from the transgenic mice may also be analyzed for β-galactosidase expression using standard assays for this marker enzyme [for example, by direct staining of the product of a reaction involving β-galactosidase and the X-gal substrate, see, e.g., Jones (1986) EMBO 5:3133-3142, or by measurement of β-galactosidase activity, see, e.g., Miller (1972) in Experiments in Molecular Genetics pp. 352-355, Cold Spring Harbor Press]. Analysis of β-galactosidase expression is used in particular to evaluate transgene expression in cells obtained from control transgenic mice in which the only transgene transferred into the embryo was the β-galactosidase gene.

[0772] Stable expression of the anti-HIV ribozyme gene in cells obtained from the transgenic mice may be evaluated in several ways. First, DNA isolated from the cells according to standard procedures may be subjected to nucleic acid amplification using primers corresponding to the ribozyme gene sequence. If the gene is contained within the cells, an amplified product of pre-determined size is detected upon hybridization of the reaction mixture to a nucleic acid probe based on the ribozyme gene sequence. Furthermore, DNA isolated from the cells may be analyzed using Southern blot methods for hybridization to such a nucleic acid probe. Second, RNA isolated from the cells may be subjected to northern blot hybridization to determine if the cells express RNA that hybridizes to nucleic acid probes based on the ribozyme gene. Third, the cells may be analyzed for the presence of anti-HIV ribozyme activity as described, for example, in Chang et al. (1990) Clin. Biotech. 2:23-31. In this analysis, RNA isolated from the cells is mixed with radioactively labeled HIV gag target RNA, which can be obtained by in vitro transcription of gag gene template, under reaction conditions favorable to in vitro cleavage of the gag target, such as those described in Chang et al. (1990) Clin. Biotech. 2:23-31. After the reaction has been stopped, the mixture is analyzed by gel electrophoresis to determine if cleavage products smaller in size than the whole template are detected; presence of such cleavage fragments is indicative of the presence of stably expressed ribozyme.
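The "amplified product of pre-determined size" in the first analysis can be predicted from the primer positions on the target sequence. The sketch below shows one simple way to compute the expected amplicon length by exact string matching. The template and primer sequences are toy examples invented for illustration (they are not the ribozyme gene or any sequence from this document), and the function names are hypothetical; exact matching ignores mismatches and primer design constraints.

```python
from typing import Optional

# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(template: str, fwd_primer: str, rev_primer: str) -> Optional[int]:
    """Expected product length, or None if either primer site is absent.

    The forward primer must match the given strand; the reverse primer
    must match the opposite strand downstream of the forward primer.
    """
    start = template.find(fwd_primer)
    if start < 0:
        return None
    rev_site = reverse_complement(rev_primer)
    site_pos = template.find(rev_site, start + len(fwd_primer))
    if site_pos < 0:
        return None
    return site_pos + len(rev_site) - start


if __name__ == "__main__":
    # Toy template: 4-nt flank, 6-nt forward site, 8-nt spacer, 6-nt reverse site, 4-nt flank.
    toy_template = "TTTT" + "ACGTAC" + "GGGGGGGG" + "CATGCA" + "AAAA"
    toy_fwd = "ACGTAC"
    toy_rev = reverse_complement("CATGCA")
    print(amplicon_length(toy_template, toy_fwd, toy_rev))
```

A sample lacking the target returns None in this sketch, which corresponds to the absence of a band of the expected size.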

[0773] b. Analysis of Whole Transgenic Mice

[0774] Whole transgenic mice that have been generated by transfer of the anti-HIV ribozyme gene [as well as selection and marker genes] into embryos or fertilized eggs can additionally be analyzed for transgene expression by challenging the mice with infection with HIV. It is possible for mice to be infected with HIV upon intraperitoneal injection with high-producing HIV-infected U937 cells [see, e.g., Locardi et al. (1992) J. Virol. 66:1649-1654]. Successful infection may be confirmed by analysis of DNA isolated from cells, such as peripheral blood mononuclear cells, obtained from transgenic mice that have been injected with HIV-infected human cells. The DNA of infected transgenic mouse cells will contain HIV-specific gag and env sequences, as demonstrated by, for example, nucleic acid amplification using HIV-specific primers. If the cells also stably express the anti-HIV ribozyme, then analysis of RNA extracts of the cells should reveal the smaller gag fragments arising by cleavage of the gag transcript by the ribozyme.

[0775] Additionally, the transgenic mice carrying the anti-HIV ribozyme gene can be crossed with transgenic mice expressing human CD4 (i.e., the cellular receptor for HIV) [see Gillespie et al. (1993) Mol. Cell. Biol. 13:2952-2958; Hanna et al. (1994) Mol. Cell. Biol. 14:1084-1094; and Yeung et al. (1994) J. Exp. Med. 180:1911-1920, for a description of transgenic mice expressing human CD4]. The offspring of these crossed transgenic mice expressing both the CD4 and anti-HIV ribozyme transgenes should be more resistant to infection [as a result of a reduction in the levels of active HIV in the cells] than mice expressing CD4 alone [without expressing anti-HIV ribozyme].

[0776] 4. Development of Transgenic Chickens using Artificial Chromosomes

[0777] The development of transgenic chickens has many applications in the improvement of domestic poultry, an agricultural species of commercial significance, such as the introduction of disease resistance genes and genes encoding therapeutic proteins. It appears that efforts in the area of chicken transgenesis have been hampered by the difficulty of achieving stable expression of transgenes in chicken cells using conventional methods of gene transfer via random introduction into recipient cells. Artificial chromosomes are, therefore, particularly useful in the development of transgenic chickens because they provide for stable maintenance of transgenes in host cells.

[0778] a. Preparation of Artificial Chromosomes for Introduction of Transgenes into Recipient Chicken Cells

[0779] (i) Mammalian Artificial Chromosomes

[0780] Mammalian artificial chromosomes, such as the SATACs and minichromosomes described herein, can be modified to incorporate detectable reporter genes and/or transgenes of interest for use in developing transgenic chickens. Alternatively, chicken-specific artificial chromosomes can be constructed using the methods herein. In particular, chicken artificial chromosomes [CACs] can be prepared using the methods herein for preparing MACs; or, as described above, the chicken libraries can be introduced into MACs provided herein and the resulting MACs introduced into chicken cells and those that are functional in chicken cells selected.

[0781] As described in Examples 4 and 7, and elsewhere herein, artificial chromosome-containing mouse LMTK⁻-derived cell lines, or minichromosome-containing cell lines, as well as hybrids thereof, can be transfected with selected DNA to generate MACs [or CACs] that have integrated the foreign DNA for functional expression of heterologous genes contained within the DNA.

[0782] To generate MACs or CACs containing transgenes to be expressed in chicken cells, the MAC-containing cell lines may be transfected with DNA that includes λ DNA and transgenes of interest operably linked to a promoter that is capable of driving expression of genes in chicken cells. Alternatively, the minichromosomes or MACs [or CACs], produced as described above, can be isolated and introduced into cells, followed by targeted integration of selected DNA. Vectors for targeted integration are provided herein or can be constructed as described herein.

[0783] Promoters of interest include constitutive, inducible and tissue (or cell)-specific promoters known to those of skill in the art to promote expression of genes in chicken cells. For example, expression of the lacZ gene in chicken blastodermal cells and primary chicken fibroblasts has been demonstrated using a mouse heat-shock protein 68 (hsp 68) promoter [phspPTlacZpA; see Brazolot et al. (1991) Mol. Reprod. Devel. 30:304-312], a Zn²⁺-inducible chicken metallothionein (cMt) promoter [pCBcMtlacZ; see Brazolot et al. (1991) Mol. Reprod. Devel. 30:304-312], the constitutive Rous sarcoma virus and chicken β-actin promoters in tandem [pmiwZ; see Brazolot et al. (1991) Mol. Reprod. Devel. 30:304-312] and the constitutive cytomegalovirus (CMV) promoter. Of particular interest herein are egg-specific promoters that are derived from genes, such as ovalbumin and lysozyme, that are expressed in eggs.

[0784] The choice of promoter will depend on a variety of factors, including, for example, whether the transgene product is to be expressed throughout the transgenic chicken or restricted to certain locations, such as the egg. Cell-specific promoters functional in chickens include the steroid-responsive promoter of the egg ovalbumin protein-encoding gene [see Gaub et al. (1987) EMBO J. 6:2313-2320; Tora et al. (1988) EMBO J. 7:3771-3778; Park et al. (1995) Biochem. Mol. Biol. Int. (Australia) 36:811-816].

[0785] (ii) Chicken Artificial Chromosomes

[0786] Additionally, chicken artificial chromosomes may be generated using methods described herein. For example, chicken cells, such as primary chicken fibroblasts [see Brazolot et al. (1991) Mol. Reprod. Devel. 30:304-312], may be transfected with DNA that encodes a selectable marker [such as a protein that confers resistance to antibiotics] and that includes DNA (such as chicken satellite DNA) that targets the introduced DNA to the pericentric region of the endogenous chicken chromosomes. Transfectants that survive growth on selection medium are then analyzed, using methods described herein, for the presence of artificial chromosomes, including minichromosomes, and particularly SATACs. An artificial chromosome-containing transfectant cell line may then be transfected with DNA encoding the transgene of interest [fused to an appropriate promoter] along with DNA that targets the foreign DNA to the chicken artificial chromosome.

[0787] b. Introduction of Artificial Chromosomes Carrying Transgenes of Interest into Recipient Chicken Cells

[0788] Cell lines containing artificial chromosomes that harbor transgene(s) of interest (i.e., donor cells) may be fused with recipient chicken cells in order to transfer the chromosomes into the recipient cells. Alternatively, the artificial chromosomes may be isolated from the donor cells, for example, using methods described herein [see, e.g., Example 10], and directly introduced into recipient cells.

[0789] Exemplary chicken recipient cell lines include, but are not limited to, stage X blastoderm cells [see, e.g., Brazolot et al. (1991) Mol. Reprod. Dev. 30:304-312; Etches et al. (1993) Poultry Sci. 72:882-889; Petitte et al. (1990) Development 108:185-189] and chick zygotes [see, e.g., Love et al. (1994) Biotechnology 12:60-63].

[0790] For example, microcell fusion is one method for introduction of artificial chromosomes into avian cells [see, e.g., Dieken et al. (1996) Nature Genet. 12:174-182 for methods of fusing microcells with DT40 chicken pre-B cells]. In this method, microcells are prepared [for example, using procedures described in Example 1.A.5] from the artificial chromosome-containing cell lines and fused with chicken recipient cells.

[0791] Isolated artificial chromosomes may be directly introduced into chicken recipient cell lines through, for example, lipid-mediated carrier systems, such as lipofection procedures [see, e.g., Brazolot et al. (1991) Mol. Reprod. Dev. 30:304-312] or direct microinjection. Microinjection is generally preferred for introduction of the artificial chromosomes into chicken zygotes [see, e.g., Love et al. (1994) Biotechnology 12:60-63].

[0792] c. Development of Transgenic Chickens

[0793] Transgenic chickens may be developed by injecting recipient Stage X blastoderm cells (which have received the artificial chromosomes) into embryos at a similar stage of development [see, e.g., Etches et al. (1993) Poultry Sci. 72:882-889; Petitte et al. (1990) Development 108:185-189; and Carsience et al. (1993) Development 117:669-675]. The recipient chicken embryos within the shell are candled and allowed to hatch to yield a germline chimeric chicken that will express the transgene(s) in some of its cells.

[0794] Alternatively, the artificial chromosomes may be introduced into chick zygotes, for example through direct microinjection [see, e.g., Love et al. (1994) Biotechnology 12:60-63], and thereby are incorporated into at least a portion of the cells in the chicken. Inclusion of a tissue-specific promoter, such as an egg-specific promoter, will ensure appropriate expression of operatively-linked heterologous DNA.

[0795] The DNA of interest may also be introduced into a minichromosome, by methods provided herein. The minichromosome may either be one provided herein, or one generated in chicken cells using the methods herein. The heterologous DNA will be introduced using a targeting vector, such as those provided herein, or constructed as provided herein.

[0796] Since modifications will be apparent to those of skill in this art, it is intended that this invention be limited only by the scope of the appended claims.

0 SEQUENCE LISTING (1) GENERAL INFORMATION: (iii) NUMBER OF SEQUENCES:34 (2) INFORMATION FOR SEQ ID NO: 1: (i) SEQUENCE CHARACTERISTICS: (A)LENGTH: 1293 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single(D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL:NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINALSOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:GAATTCATCA TTTTTCANGT CCTCAAGTGG ATGTTTCTCA TTTNCCATGA TTTTAAGTTT 60TCTCGCCATA TTCCTGGTCC TACAGTGTGC ATTTCTCCAT TTTNCACGTT TTNCAGTGAT 120TTCGTCATTT TCAAGTCCTC AAGTGGATGT TTCTCATTTN CCATGAATTT CAGTTTTCTN 180GCCATATTCC ACGTCCTACA GNGGACATTT CTAAATTTNC CACCTTTTTC AGTTTTCCTC 240GCCATATTTC ACGTCCTAAA ATGTGTATTT CTCGTTTNCC GTGATTTTCA GTTTTCTCGC 300CAGATTCCAG GTCCTATAAT GTGCATTTCT CATTTNNCAC GTTTTTCAGT GATTTCGTCA 360TTTTTTCAAG TCGGCAAGTG GATGTTTCTC ATTTNCCATG ATTTNCAGTT TTCTTGNAAT 420ATTCCATGTC CTACAATGAT CATTTTTAAT TTTCCACCTT TTCATTTTTC CACGCCATAT 480TTCATGTCCT AAAGTGTATA TTTCTCCTTT TCCGCGATTT TCAGTTTTCT CGCCATATTC 540CAGGTCCTAC AGTGTGCATT CCTCATTTTT CACCTTTTTC ACTGATTTCG TCATTTTTCA 600AGTCGTCAAC TGGATCTTTC TAATTTTCCA TGATTTTCAG TTATCTTGTC ATATTCCATG 660TCCTACAGTG GACATTTCTA AATTTTCCAA CTTTTTCAAT TTTTCTCGAC ATATTTGACG 720TGCTAAAGTG TGTATTTCTT ATTTTCCGTG ATTTTCAGTT TTCTCGCCAT ATTCCAGGCG 780CTAATAGTGT GCATTTCTCA TTTTTCACGT TTTTCAGTGA TTTCGTCATT TTTTCCAGTT 840GTCAAGGGGA TGTTTCTCAT TTTCCATGAG TGTCAGTTTT CTTGCTATAT TCCATGTCCT 900ACAGTGACAT TTCTAAATAT TATACCTTTT TCAGTTTTTC TCACCATATT TCACGTCCTA 960AAGTATATAT TTCTCATTTT CCCTGATTTT CAGTTTCCTT GCCATATTCC AGGTCCTACA 1020GTGTGCATTT CTCATTTTTC ACGTTTTTCA GTAATTTCTT CATTTTTTAA GCCCTCAAAT 1080GGATGTTTCT CATTTTCCAT GATTTTCAGT TTTCTTGCCA TATACCATGT CCTACAGTGG 1140ACATTTCTAA ATTATCCACC TTTTTCAGTT TTTCATCGGC ACATTTCACG TCCTAAAGTG 1200TGTATTTCTA ATTTTCAGTG ATTTTCAGTT TTCTCGCCAT ATTCCAGGAC CTACAGTGTG 1260CATTTCTCAT TTTTCACGTT TTTCAGTGAA TTC 1293 (2) INFORMATION FOR SEQ ID NO:2: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1044 base pairs (B) 
TYPE:nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULETYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v)FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi)SEQUENCE DESCRIPTION: SEQ ID NO: 2: AGGCCTATGG TGAAAAAGGA AATATCTTCCCCTGAAAACT AGACAGAAGG ATTCTCAGAA 60 TCTTATTTGT GATGTGCGCC CCTCAACTAACAGTGTTGAA GCTTTCTTTT GATAGAGCAG 120 TTTTGAAACA CTCTTTTTGT AAAATCTGCAAGAGGATATT TGGATAGCTT TGAGGATTTC 180 CGTTGGAAAC GGGATTGTCT TCATATAAACCCTAGACAGA AGCATTCTCA GAAGCTTCAT 240 TGGGATGTTT CAGTTGAAGT CACAGTGTTGAACAGTCCCC TTTCATAGAG CAGGTTTGAA 300 ACACTCTTTT TTGTAGTATC TGGAAGTGGACATTTGGAGC GATCTCAGGA CTGCGGTGAA 360 AAAGGAAATA TCTTCCAATA AAAGCTAGATAGAGGCAATG TCAGAAACCT TTTTCATGAT 420 GTATCTACTC AGCTAACAGA GTTGAACCTTCCTTTGAGAG AGCAGTTTTG AAACACTCTT 480 TTTGTGGAAT CTGCAAGTGG ATATTTGTCTAGCTTTGAGG ATTTCGTTGG GAAACGGGAT 540 TACATATAAA AAGCAGACAG CAGCATTCCCAGAAACTTCT TTGTGATGTT TGCATTCAAG 600 TCACAGAGTT GAACATTCCC TTTCATAGAGCAGGTTTGAA ACACACTTTT TGATGTATCT 660 GGATGTGGAC ATTTGCAGCG CTTTCAGGCCTAAGGTGAAA AGGAAATATC TTCCCCTGAA 720 AACTAGACAG AAGCATTCTC AGAAACTTATTTGTGATGTG CGCCCTCAAC TAACAGTGAA 780 GAAGCTTTCT TTTGATAGAG GCAGTTTTGAAACACTCTTT TGTGGAATCT GCAAGTGGAT 840 ATTTGTCTAG CTTTGAGGAT TTCTTTGGAAACGGGATTAC ATATAAAAAG CAGACAGCAG 900 CATTCCCAGA ATCTTGTTTG TGATGTTTGCATTCAAGTCA CAGAGTTGAA CATTCCCTTT 960 CAGAGAGCAG GTTTGAACAC TCTTTTTATAGTATCTGGAT GTGGACATTT GGAGCGCTTT 1020 CAGGGGGGAT CCTCTAGAAT TCCT 1044(2) INFORMATION FOR SEQ ID NO: 3: (i) SEQUENCE CHARACTERISTICS: (A)LENGTH: 2492 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single(D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL:NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINALSOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:CTGCAGCTGG GGGTCTCCAA TCAGGCAGGG GCCCCTTACT ACTCAGATGG GGTGGCCGAG 60TAGGGGAAGG GGGTGCAGGC TGCATGAGTG GACACAGCTG TAGGACTACC TGGGGGCTGT 120GGATCTATGG GGGTGGGGAG AAGCCCAGTG ACAGTGCCTA GAAGAGACAA GGTGGCCTGA 180GAGGGTCTGA 
GGAACATAGA GCTGGCCATG TTGGGGCCAG GTCTCAAGCA GGAAGTGAGG 240AATGGGACAG GCTTGAGGAT ACTCTACTCA GTAGCCAGGA TAGCAAGGAG GGCTTGGGGT 300TGCTATCCTG GGGTTCAACC CCCCAGGTTG AAGGCCCTGG GGGAGATGGT CCCAGGACAT 360ATTACAATGG ACACAGGAGG TTGGGACACC TGGAGTCACC AAACAAAACC ATGCCAAGAG 420AGACCATGAG TAGGGGTGTC CAGTCCAGCC CTCTGACTGA GCTGCATTGT TCAAATCCAA 480AGGGCCCCTG CTGCCACCTA GTGGCTGATG GCATCCACAT GACCCTGGGC CACACGCGTT 540TAGGGTCTCT GTGAAGACCA AGATCCTTGT TACATTGAAC GACTCCTAAA TGAGCAGAGA 600TTTCCACCTA TTCGAAACAA TCACATAAAA TCCATCCTGG AAAAAGCCTG GGGGATGGCA 660CTAAGGCTAG GGATAGGGTG GGATGAAGAT TATAGTTACA GTAAGGGGTT TAGGGTTAGG 720GATCAACGTT GGTTAGGAGT TAGGGATACA GTAGGGTACC GGTAGGGTTA GGGGTTAGGG 780TTAGGGGTTA GGGTTAGGGT TAGGGTTAGG GTTAGGGTTA GGGGTTAGGG GTTAGGGTTA 840GGGTTAGGTT TTGGGGTGGC GTATTTTGGT CTTATACGCT GTGTTCCACT GGCAATGAAA 900AGAGTTCTTG TTTTTCCTTC AGCAATTTGT CATTTTTAAA AGAGTTTAGC AATTCTAACA 960GATATAGACC AGCTGTGCTA TCTCATTGTG GTTTTCAATT GTAACCACAT TGTGGTTTCA 1020ATGTGTTTAC TTGCCATCTG TAGATCTTCT TTGCGTGAGG TGTCTGTTCA GATGTGTGTG 1080CATTTCTTGN NTTTNGGCTG TTTAACTTAT TGTTTAGTTT TAATAATTTT TTATATATTT 1140GAAGACAAAT CTTTCTCAGA TGTGTATTTG CAAATATTTC TTCAATATGA GGCTTGCTTT 1200TGTCTCTAAC AAGGTCTCTT CAGAGATAAC TTAAATATAA GAAATCCACA CTGTCACTTC 1260TTTTGTGTAT ATCTACCTTT TGTGTCATTT GTTAAAATTC ATTACCAAAC CCAAAGGCAG 1320ATAGCTTTTC TTCTATTGTT TCTTCTAGAA ATTTGTATAG TTTTGCATTT TTAGTGTAAG 1380GATGATTTTG AGTGATTATT TGTGTAAGTT GTAAAGTTTT CGTCTATATC CATATCATTT 1440CTTATGGTTT CCAATTAATC GTTCCCTCAC TATTTTTGGG AAAGACACAG GATAGTGGGC 1500TTTGTTAGAG TAGATAGGTA GCTAGACATG AACAGGAGGG GGCCTCCTGG AAAAGGGAAA 1560GTCTGGGAAG GCTCACCTGG AGGACCACCA AAAATTCACA TATTAGTAGC ATCTCTAGTG 1620CTGGAGTGGA TGGGCACTTG TCAATTGTGG GTAGGAGGGA AAAGAGGTCC TATGCAGAAA 1680GAAACTCCCT AGAACTCCTC TGAAGATGCC CCAATCATTC ACTCTGCAAT AAAAATGTCA 1740GAATATTGCT AGCTACATGC TGATAAGGNN AAAGGGGACA TTCTTAAGTG AAACCTGGCA 1800CCATAAGTAC AGATTAGGGC AGAGAAGGAC ATTCAAAAGA GGCAGGCGCA GTAGGTACAA 1860ACGTGATCGC TGTCAGTGTG CCTGGGATGG CGGGAAGGAG GCTGGTGCCA 
GAGTGGATTC 1920GTATTGATCA CCACACATAT ACCTCAACCA ACAGTGAGGA GGTCCCACAA GCCTAAGTGG 1980GGCAAGTTGG GGAGCTAAGG CAGTAGCAGG AAAACCAGAC AAAGAAAACA GGTGGAGACT 2040TGAGACAGAG GCAGGAATGT GAAGAAATCC AAAATAAAAT TCCCTGCACA GGACTCTTAG 2100GCTGTTTAAT GCATCGCTCA GTCCCACTCC TCCCTATTTT TCTACAATAA ACTCTTTACA 2160CTGTGTTTCT TTTCAATGAA GTTATCTGCC ATCTTTGTAT TGCCTCTTGG TGAAAATGTT 2220TCTTCCAAGT TAAACAAGAA CTGGGACATC AGCTCTCCCC AGTAATAGCT CCGTTTCAGT 2280TTGAATTTAC AGAACTGATG GGCTTAATAA CTGGCGCTCT GACTTTAGTG GTGCAGGAGG 2340CCGTCACACC GGGACCAAGA GTGCCCTGCC TAGTCCCCAT CTGCCCGCAG GTGGCGGCTG 2400CCTCGACACT GACAGCAATA GGGTCCGGCA GTGTCCCCAG CTGCCAGCAG GGGGCGTACG 2460ACGACTACAC TGTGAGCAAG AGGGCCCTGC AG 2492 (2) INFORMATION FOR SEQ ID NO:4: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 28 base pairs (B) TYPE:nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULETYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v)FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi)SEQUENCE DESCRIPTION: SEQ ID NO: 4: GGGGAATTCA TTGGGATGTT TCAGTTGA 28(2) INFORMATION FOR SEQ ID NO: 5: (i) SEQUENCE CHARACTERISTICS: (A)LENGTH: 29 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single(D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL:NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINALSOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:CGAAAGTCCC CCCTAGGAGA TCTTAAGGA 29 (2) INFORMATION FOR SEQ ID NO: 6: (i)SEQUENCE CHARACTERISTICS: (A) LENGTH: 47 base pairs (B) TYPE: nucleicacid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE:RNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE:<Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION:SEQ ID NO: 6: CCGCTTAATA CTCTGATGAG TCCGTGAGGA CGAAACGCTC TCGCACC 47 (2)INFORMATION FOR SEQ ID NO: 7: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH:25 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D)TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) 
HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: CGATTTAAAT TAATTAAGCC CGGGC 25 (2) INFORMATION FOR SEQ ID NO: 8: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 27 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: TAAATTTAAT TAATTCGGGC CCGTCGA 27 (2) INFORMATION FOR SEQ ID NO: 9: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 69 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: ATG TAC AGG ATG CAA CTC CTG TCT TGC ATT GCA CTA AGT CTT GCA CTT 48 Met Tyr Arg Met Gln Leu Leu Ser Cys Ile Ala Leu Ser Leu Ala Leu 1 5 10 15 GTC ACA AAC AGT GCA CCT ACT 69 Val Thr Asn Ser Ala Pro Thr 20 (2) INFORMATION FOR SEQ ID NO: 10: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 945 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (vi) ORIGINAL SOURCE: (ix) FEATURE: (A) NAME/KEY: Coding Sequence (B) LOCATION: 1...942 (D) OTHER INFORMATION: Renilla reniformis Luciferase (x) PUBLICATION INFORMATION: (H) DOCUMENT NUMBER: 5,418,155 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: AGC TTA AAG ATG ACT TCG AAA GTT TAT GAT CCA GAA CAA AGG AAA CGG 48 Ser Leu Lys Met Thr Ser Lys Val Tyr Asp Pro Glu Gln Arg Lys Arg 1 5 10 15 ATG ATA ACT GGT CCG CAG TGG TGG GCC AGA TGT AAA CAA ATG AAT GTT 96 Met Ile Thr Gly Pro Gln Trp Trp Ala Arg Cys Lys Gln Met Asn Val 20 25 30 CTT GAT TCA TTT ATT AAT TAT TAT GAT TCA GAA AAA CAT GCA GAA AAT 144 Leu Asp Ser Phe Ile Asn Tyr Tyr Asp Ser Glu Lys His Ala Glu Asn 35 40 45 GCT GTT ATT TTT TTA CAT GGT AAC GCG GCC TCT TCT TAT TTA TGG CGA 192 Ala Val Ile Phe Leu His Gly Asn Ala Ala Ser Ser Tyr Leu Trp Arg 50 55 60 CAT GTT GTG CCA CAT ATT GAG CCA GTA GCG CGG TGT ATT 
ATA CCA GAT 240 His Val Val ProHis Ile Glu Pro Val Ala Arg Cys Ile Ile Pro Asp 65 70 75 80 CTT ATT GGTATG GGC AAA TCA GGC AAA TCT GGT AAT GGT TCT TAT AGG 288 Leu Ile Gly MetGly Lys Ser Gly Lys Ser Gly Asn Gly Ser Tyr Arg 85 90 95 TTA CTT GAT CATTAC AAA TAT CTT ACT GCA TGG TTG AAC TTC TTA ATT 336 Leu Leu Asp His TyrLys Tyr Leu Thr Ala Trp Leu Asn Phe Leu Ile 100 105 110 TAC CAA AGA AGATCA TTT TTT GTC GGC CAT GAT TGG GGT GCT TGT TTG 384 Tyr Gln Arg Arg SerPhe Phe Val Gly His Asp Trp Gly Ala Cys Leu 115 120 125 GCA TTT CAT TATAGC TAT GAG CAT CAA GAT AAG ATC AAA GCA ATA GTT 432 Ala Phe His Tyr SerTyr Glu His Gln Asp Lys Ile Lys Ala Ile Val 130 135 140 CAC GCT GAA AGTGTA GTA GAT GTG ATT GAA TCA TGG GAT GAA TGG CCT 480 His Ala Glu Ser ValVal Asp Val Ile Glu Ser Trp Asp Glu Trp Pro 145 150 155 160 GAT ATT GAAGAA GAT ATT GCG TTG ATC AAA TCT GAA GAA GGA GAA AAA 528 Asp Ile Glu GluAsp Ile Ala Leu Ile Lys Ser Glu Glu Gly Glu Lys 165 170 175 ATG GTT TTGGAG AAT AAC TTC TTC GTG GAA ACC ATG TTG CCA TCA AAA 576 Met Val Leu GluAsn Asn Phe Phe Val Glu Thr Met Leu Pro Ser Lys 180 185 190 ATC ATG AGAAAG TTA GAA CCA GAA GAA TTT GCA GCA TAT CTT GAA CCA 624 Ile Met Arg LysLeu Glu Pro Glu Glu Phe Ala Ala Tyr Leu Glu Pro 195 200 205 TTC AAA GAGAAA GGT GAA GTT CGT CGT CCA ACA TTA TCA TGG CCT CGT 672 Phe Lys Glu LysGly Glu Val Arg Arg Pro Thr Leu Ser Trp Pro Arg 210 215 220 GAA ATC CCGTTA GTA AAA GGT GGT AAA CCT GAC GTT GTA CAA ATT GTT 720 Glu Ile Pro LeuVal Lys Gly Gly Lys Pro Asp Val Val Gln Ile Val 225 230 235 240 AGG AATTAT AAT GCT TAT CTA CGT GCA AGT GAT GAT TTA CCA AAA ATG 768 Arg Asn TyrAsn Ala Tyr Leu Arg Ala Ser Asp Asp Leu Pro Lys Met 245 250 255 TTT ATTGAA TCG GAT CCA GGA TTC TTT TCC AAT GCT ATT GTT GAA GGC 816 Phe Ile GluSer Asp Pro Gly Phe Phe Ser Asn Ala Ile Val Glu Gly 260 265 270 GCC AAGAAG TTT CCT AAT ACT GAA TTT GTC AAA GTA AAA GGT CTT CAT 864 Ala Lys LysPhe Pro Asn Thr Glu Phe Val Lys Val Lys Gly Leu His 275 280 285 TTT TCGCAA GAA GAT GCA CCT GAT GAA ATG GGA AAA TAT ATC 
AAA TCG 912 Phe Ser GlnGlu Asp Ala Pro Asp Glu Met Gly Lys Tyr Ile Lys Ser 290 295 300 TTC GTTGAG CGA GTT CTC AAA AAT GAA CAA TAA 945 Phe Val Glu Arg Val Leu Lys AsnGlu Gln 305 310 (2) INFORMATION FOR SEQ ID NO: 11: (i) SEQUENCECHARACTERISTICS: (A) LENGTH: 30 base pairs (B) TYPE: nucleic acid (C)STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: GenomicDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE:<Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi) SEQUENCE DESCRIPTION:SEQ ID NO: 11: TTTGAATTC A TGTACAGGAT GCAACTCCTG 30 (2) INFORMATION FORSEQ ID NO: 12: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 30 base pairs(B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear(ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE:NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (ix) FEATURE: (xi)SEQUENCE DESCRIPTION: SEQ ID NO: 12: TTTGAATTCA GTAGGTGCAC TGTTTGTCAC 30(2) INFORMATION FOR SEQ ID NO: 13: (i) SEQUENCE CHARACTERISTICS: (A)LENGTH: 1434 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single(D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL:NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINALSOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: CCTCCACGCA CGTTGTGATATGTAGATGAT AATCATTATC AGAGCAGCGT TGGGGGATAA 60 TGTCGACATT TCCACTCCCAATGACGGTGA TGTATAATGC TCAAGTATTC TCCTGCTTTT 120 TTACCACTAA CTAGGAACTGGGTTTGGCCT TAATTCAGAC AGCCTTGGCT CTGTCTGGAC 180 AGGTCCAGAC GACTGACACCATTAACACTT TGTCAGCCTC AGTGACTACA GTCATAGATG 240 AACAGGCCTC AGCTAATGTCAAGATACAGA GAGGTCTCAT GCTGGTTAAT CAACTCATAG 300 ATCTTGTCCA GATACAACTAGATGTATTAT GACAAATAAC TCAGCAGGGA TGTGAACAAA 360 AGTTTCCGGG ATTGTGTGTTATTTCCATTC AGTATGTTAA ATTTACTAGG ACAGCTAATT 420 TGTCAAAAAG TCTTTTTCAGTATATGTTAC AGAATTGGAT GGCTGAATTT GAACAGATCC 480 TTCGGGAATT GAGACTTCAGGTCAACTCCA CGCGCTTGGA CCTGTCGCTG ACCAAAGGAT 540 TACCCAATTG GATCTCCTCAGCATTTTCTT TCTTTAAAAA ATGGGTGGGA TTAATATTAT 600 TTGGAGATAC ACTTTGCTGTGGATTAGTGT TGCTTCTTTG ATTGGTCTGT 
AAGCTTAAGG 660 CCCAAACTAG GAGAGACAAGGTGGTTATTG CCCAGGCGCT TGCAGGACTA GAACATGGAG 720 CTTCCCCTGA TATATGGTTATCTATGCTTA GGCAATAGGT CGCTGGCCAC TCAGCTCTTA 780 TATCCCACGA GGCTAGTCTCATTGTACGGG ATAGAGTGAG TGTGCTTCAG CAGCCCGAGA 840 GAGTTGCAAG GCTAAGCACTGCAATGGAAA GGCTCTGCGG CATATATGTG CCTATTCTAG 900 GGGGACATGT CATCTTTCATGAAGGTTCAG TGTCCTAGTT CCCTTCCCCC AGGCAAAACG 960 ACACGGGAGC AGGTCAGGGTTGCTCTGGGT AAAAGCCTGT GAGCCTGGGA GCTAATCCTG 1020 TACATGGCTC CTTTACCTACACACTGGGGA TTTGACCTCT ATCTCCACTC TCATTAATAT 1080 GGGTGGCCTA TTTGCTCTTATTAAAAGGAA AGGGGGAGAT GTTGGGAGCC GCGCCCACTG 1140 TCGCCGTTAC AAGATGGCGCTGACAGCTGT GTTCTAAGTG GTAAACAAAT AATCTGCGGG 1200 TGTGCCGAGG GTGGTTCTTCACTCCATGTG CTCTGCCTTC CCCGTGACGT CAACTCGAGA 1260 GATGGGCTGC AGCCAATCAGGGAGTGACAC GTCCTAGGCG AAGGAGAATT CTCCTTATCG 1320 GGGACGGGGT TTCGTTCTCTCTCTCTCTCT TGCTTCTCTC TCTTGCTTTT TCGCTCTACA 1380 GCTTCCCGTA AAGTGATAATGATTATCATC TACATATCAC AACGTGCGTG GAGG 1434 (2) INFORMATION FOR SEQ IDNO: 14: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1400 base pairs (B)TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii)MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO(v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCEDESCRIPTION: SEQ ID NO: 14: CCTCCACGCA CGTTGTGATA TGTAGATGAT AATCATTATCAGAGCAGCGT TGGGGGATAA 60 TGTCGACATT TCCACTCCCA ATGACGGTGA TGTATAATGCTCAAGTATTC TCCTGCTTTT 120 TTACCACTAA CTAGGAACTG GGTTTGGCCT TAATTCAGACAGCCTTGGCT CTGTCTGGAC 180 AGGTCCAGAT ACAACTAGAT GTATTATGAC AAATAACTCAGCAGGGATGT GAACAAAAGT 240 TTCCGGGATT GCGTGTTATT TCCATCCAGT ATGTTAAATTTACTAGGGCA GCTAATTTGT 300 CAAAAAGTCT TTTCCAGTAT ATGTTACAGA ATTGGATGGCTGAATTTGAA CAGATCCTTC 360 GGGAATTGAG ACTTCAGGTC AACTCCACGC GCTTGGACCTGTCCCTGACC AAAGGATTAC 420 CCAATTGGAT CTCCTCAGCA TTTTCTTTCT TTAAAAAATGGGTGGGATTA ATATTATTTG 480 GAGATACACT TTGCTGTGGA TTAGTGTTGC TTCTTTGATTGGTCTGTAAG CTTAAGGCCC 540 AAACTAGGAG AGACAAGGTG GTTATTGCCC AGGCGCTTGCAGGACTAGAA CATGGAGCTT 600 CCCCTGATAT ATCTATGCTT AGGCAATAGG TCGCTGGCCACTCAGCTCTT ATATCCCATG 660 
AGGCTAGTCT CATTGCACGG GATAGAGTGA GTGTGCTTCAGCAGCCCGAG AGAGTTGCAC 720 GGCTAAGCAC TGCAATGGAA AGGCTCTGCG GCATATATGAGCCTATTCTA GGGAGACATG 780 TCATCTTTCA AGAAGGTTGA GTGTCCAAGT GTCCTTCCTCCAGGCAAAAC GACACGGGAG 840 CAGGTCAGGG TTGCTCTGGG TAAAAGCCTG TGAGCCTAAGAGCTAATCCT GTACATGGCT 900 CCTTTACCTA CACACTGGGG ATTTGACCTC TATCTCCACTCTCATTAATA TGGGTGGCCT 960 ATTTGCTCTT ATTAAAAGGA AAGGGGGAGA TGTTGGGAGCCGCGCCCACA TTCGCCGTTA 1020 CAAGATGGCG CTGACAGCTG TGTTCTAAGT GGTAAACAAATAATCTGCGC ATGCGCCGAG 1080 GGTGGTTCTT CACTCCATGT GCTCTGCCTT CCCCGTGACGTCAACTCGGC CGATGGGCTG 1140 CAGTCAATCA GGGAGTGACA CGTCCTAGGC GAAGGAAAATTCTCCTTAAT AGGGACGGGG 1200 TTTCGTTTTC TCTCTCTCTT GCTTCGCTCT CTCTTGCTTCTTGCTCTCTT TTCCTGAAGA 1260 TGTAAGAATA AAGCTTTGCC GCAGAAGATT CTGGTCTGTGGTGTTCTTCC TGGCCGGTCG 1320 TGAGAACGCG TCTAATAACA ATTGGTGCCG AAACCCGGGTGATAATGATT ATCATCTACA 1380 TATCACAACG TGCGTGGAGG 1400 (2) INFORMATIONFOR SEQ ID NO: 15: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1369 basepairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY:linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv)ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi)SEQUENCE DESCRIPTION: SEQ ID NO: 15: CCTCCACGCA CGTTGTGATA TGTAGATGATAATCATTATC ACTTTACGGG TCCTTTCACT 60 ACAACTGCCA CGAGGCCCCG TGCTCTGGTAATAGATCTTT GCTGAAAAGG CACACACATG 120 ACACATTACT CAAGGTGGGC TCATCTGAGCTGCAGATTCA GCTTAATATG AATCTTGCCA 180 ATTGTGTGAA ATCATAAATC TTCAAAGTGACACTCATTGC CAGACACAGG TGCCCACCTT 240 TGGCATAATA AACAAACACA AATTATCTATTATATAAAGG GTGTTAGAAG ATGCTTTAGA 300 ATACAAATAA ATCATGGTAG ATAACAGTAAGTTGAGAGCT TAAATTTAAT AAAGTGATAT 360 ACCTAATAAA AATTAAATTA AGAAGGTGTGAATATACTAC AGTAGGTAAA TTATTTCATT 420 AATTTATTTT CTTTCTTAAT CCTTTATAATGTTTTCTGCT ATTGTCAATT GCACATCCAT 480 ATGTTCAATT CTTCACTGTA ATGAAGAAATGTAGTAAATA TACTTTCCGA ACAAGTTGTA 540 TCAAATATGT TACACTTGAT TCCGTGTGTTACTTATCATT TTATTATTAT ATTGATTGCA 600 TTCCTTCGTT ACTTGATATT ATTACAAGGTACATATTTAT TCTCTCAGAT CTTCATTATA 660 CTCTAACCAT TTTATAACAT ACTTTATTTATTCATTTCTT ATGTGTGCTG 
TGAGGCACAA 720 ATGCCAGAGA GAACTTGAGC AGATAAGAGGACAAATTGCA AGAGTCAGTT ACCTCCTGCT 780 GTTCCTTGGA AACTCAGGAT CAAATTCAGGTTGTCAGGCT TGGCAGCATG CACTTTTTAC 840 CAGTGCCTCC ATCTTGCTAG CCCTGAACATCAAGCTTTGC AGACAGACAG GCTACACTAA 900 GTGAACTGGT CATTCACAGC ATGCATGGTGATTTATTGTT ACTTTCTATT CCATGCCTTT 960 ACTATTTCTA CTAGGTGCTA GCTAGTACTGTATTTCGAGA TAGAAGTTAC TGAAAGAAAA 1020 TTACATTGTT TTCTATAGAT CCTTGATACTCTTTCAGCAG ATATAGAGTT TTAATCAGGT 1080 CCTAGACCCT TTCTTCACTC TTATTAAATACTAAGTACAA ATTAAGTTTA TCCAAAACAG 1140 TACGGATGTT GATTTTGTGC AGTTCTACTATGATAATAGT CTAGCTTCAT AAATCTGACA 1200 CACTTATTGG GAATGTTTTT GTTAATAAAAGATTCAGGTG TTACTCTAGG TCAAGAGAAT 1260 ATTAAACATC AGTCCCAAAT TACAAACTTCAATAAAAGAT TTGACTCTCC AGTGGTGGCA 1320 ATATAAAGTG ATAATGATTA TCATCTACATATCACAACGT GCGTGGAGG 1369 (2) INFORMATION FOR SEQ ID NO: 16: (i)SEQUENCE CHARACTERISTICS: (A) LENGTH: 22118 base pairs (B) TYPE: nucleicacid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE:Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENTTYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ IDNO: 16: GAATTCCCCT ATCCCTAATC CAGATTGGTG GAATAACTTG GTATAGATGTTTGTGCATTA 60 AAAACCCTGT AGGATCTTCA CTCTAGGTCA CTGTTCAGCA CTGGAACCTGAATTGTGGCC 120 CTGAGTGATA GGTCCTGGGA CATATGCAGT TCTGCACAGA CAGACAGACAGACAGACAGA 180 CAGACAGACA GACAGACGTT ACAAACAAAC ACGTTGAGCC GTGTGCCAACACACACACAA 240 ACACCACTCT GGCCATAATT ATTGAGGACG TTGATTTATT ATTCTGTGTTTGTGAGTCTG 300 TCTGTCTGTC TGTCTGTCTG TCTGTCTGTC TATCAAACCA AAAGAAACCAAACAATTATG 360 CCTGCCTGCC TGCCTGCCTG CCTACACAGA GAAATGATTT CTTCAATCAATCTAAAACGA 420 CCTCCTAAGT TTGCCTTTTT TCTCTTTCTT TATCTTTTTC TTTTTTCTTTTCTTCTTCCT 480 TCCTTCCTTC CTTCCTTCCT TCCTTCCTTT CTTTCTTTCT TTCTTTCTTTCTTACTTTCT 540 TTCTTTCCTT CTTACATTTA TTCTTTTCAT ACATAGTTTC TTAGTGTAAGCATCCCTGAC 600 TGTCTTGAAG ACACTTTGTA GGCCTCAATC CTGTAAGAGC CTTCCTCTGCTTTTCAAATG 660 CTGGCATGAA TGTTGTACCT CACTATGACC AGCTTAGTCT TCAAGTCTGAGTTACTGGAA 720 AGGAGTTCCA AGAAGACTGG TTATATTTTT CATTTATTAT TGCATTTTAATTAAAATTTA 780 
ATTTCACCAA AAGAATTTAG ACTGACCAAT TCAGAGTCTG CCGTTTAAAAGCATAAGGAA 840 AAAGTAGGAG AAAAACGTGA GGCTGTCTGT GGATGGTCGA GGCTGCTTTAGGGAGCCTCG 900 TCACCATTCT GCACTTGCAA ACCGGGCCAC TAGAACCCGG TGAAGGGAGAAACCAAAGCC 960 ACCTGGAAAC AATAGGTCAC ATGAAGGCCA GCCACCTCCA TCTTGTTGTGCGGGAGTTCA 1020 GTTAGCAGAC AAGATGGCTG CCATGCACAT GTTGTCTTTC AGCTTGGTGAGGTCAAAGTA 1080 CAACCGAGTC ACAGAACAAG GAAGTATACA CAGTGAGTTC CAGGTCAGCCAGAGTTTACA 1140 CAGAGAAACC ACATCTTGAA AAAAACAAAA AAATAAATTA AATAAATATAATTTAAAAAT 1200 TTAAAAATAG CCGGGAGTGA TGGCGCATGT CTTTAATCCC AGCTCTCTTCAGGCAGAGAT 1260 GGGAGGATTT CTGAGTTTGA GGCCAGCCTG GTCTGCAAAG TGAGTTCCAGGACAGTCAGG 1320 GCTATACAGA GAAACCCTGT CTTGAAAACT AAACTAAATT AAACTAAACTAAACTAAAAA 1380 AATATAAAAT AAAAATTTTA AAGAATTTTA AAAAACTACA GAAATCAAACATAAGCCCAC 1440 GAGATGGCAA GTAACTGCAA TCATAGCAGA AATATTATAC ACACACACACACACAGACTC 1500 TGTCATAAAA TCCAATGTGC CTTCATGATG ATCAAATTTC GATAGTCAGTAATACTAGAA 1560 GAATCATATG TCTGAAAATA AAAGCCAGAA CCTTTTCTGC TTTTGTTTTCTTTTGCCCCA 1620 AGATAGGGTT TCTCTCAGTG TATCCCTGGC ATCCCTGCCT GGAACTTCCTTTGTAGGTTT 1680 GGTAGCCTCA AACTCAGAGA GGTCCTCTCT GCCTGCCTGC CTGCCTGCCTGCCTGCCTGC 1740 CTGCCTGCCT GCCTGCCTCA CTTCTTCTGC CACCCACACA ACCGAGTCGAACCTAGGATC 1800 TTTATTTCTT TCTCTTTCTC TCTTCTTTCT TTCTTTCTTT CTTTCTTTCTTTCTTTCTTT 1860 CTTTCTTTCT TTCTTATTCA ATTAGTTTTC AATGTAAGTG TGTGTTTGTGCTCTATCTGC 1920 TGCCTATAGG CCTGCTTGCC AGGAGAGGGC AACAGAACCT AGGAGAAACCACCATGCAGC 1980 TCCTGAGAAT AAGTGAAAAA ACAACAAAAA AAGGAAATTC TAATCACATAGAATGTAGAT 2040 ATATGCCGAG GCTGTCAGAG TGCTTTTTAA GGCTTAGTGT AAGTAATGAAAATTGTTGTG 2100 TGTCTTTTAT CCAAACACAG AAGAGAGGTG GCTCGGCCTG CATGTCTGTTGTCTGCATGT 2160 AGACCAGGCT GGCCTTGAAC ACATTAATCT GTCTGCCTCT GCTTCCCTAATGCTGCGATT 2220 AAAGGCATGT GCCACCACTG CCCGGACTGA TTTCTTCTTT TTTTTTTTTTTGGAAAATAC 2280 CTTTCTTTCT TTTTCTCTCT CTCTTTCTTC CTTCCTTCCT TTCTTTCTATTCTTTTTTTC 2340 TTTCTTTTTT CTTTTTTTTT TTTTTTTTAA AATTTGCCTA AGGTTAAAGGTGTGCTCCAC 2400 AATTGCCTCA GCTCTGCTCT AATTCTCTTT AAAAAAAAAC AAACAAAAAAAAAACCAAAA 2460 CAGTATGTAT GTATGTATAT TTAGAAGAAA 
TACTAATCCA TTAATAACTCTTTTTTCCTA 2520 AAATTCATGT CATTCTTGTT CCACAAAGTG AGTTCCAGGA CTTACCAGAGAAACCCTGTG 2580 TTCAAATTTC TGTGTTCAAG GTCACCCTGG CTTACAAAGT GAGTTCCAAGTCCGATAGGG 2640 CTACACAGAA AAACCATATC TCAGAAAAAA AAAAAGTTCC AAACACACACACACACACAC 2700 ACACACACAC ACACACACAC ACACACACAC ACACACACAG CGCGCCGCGGCGATGAGGGG 2760 AAGTCGTGCC TAAAATAAAT ATTTTTCTGG CCAAAGTGAA AGCAAATCACTATGAAGAGG 2820 TACTCCTAGA AAAAATAAAT ACAAACGGGC TTTTTAATCA TTCCAGCACTGTTTTAATTT 2880 AACTCTGAAT TTAGTCTTGG AAAAGGGGGC GGGTGTGGGT GAGTGAGGGCGAGCGAGCAG 2940 ACGGGCGGGC GGGCGGGTGA GTGGCCGGCG GCGGTGGCAG CGAGCACCAGAAAACAACAA 3000 ACCCCAAGCG GTAGAGTGTT TTAAAAATGA GACCTAAATG TGGTGGAACGGAGGTCGCCG 3060 CCACCCTCCT CTTCCACTGC TTAGATGCTC CCTTCCCCTT ACTGTGCTCCCTTCCCCTAA 3120 CTGTGCCTAA CTGTGCCTGT TCCCTCACCC CGCTGATTCG CCAGCGACGTACTTTGACTT 3180 CAAGAACGAT TTTGCCTGTT TTCACCGCTC CCTGTCATAC TTTCGTTTTTGGGTGCCCGA 3240 GTCTAGCCCG TTCGCTATGT TCGGGCGGGA CGATGGGGAC CGTTTGTGCCACTCGGGAGA 3300 AGTGGTGGGT GGGTACGCTG CTCCGTCGTG CGTGCGTGAG TGCCGGAACCTGAGCTCGGG 3360 AGACCCTCCG GAGAGACAGA ATGAGTGAGT GAATGTGGCG GCGCGTGACGGATCTGTATT 3420 GGTTTGTATG GTTGATCGAG ACCATTGTCG GGCGACACCT AGTGGTGACAAGTTTCGGGA 3480 ACGCTCCAGG CCTCTCAGGT TGGTGACACA GGAGAGGGAA GTGCCTGTGGTGAGGCGACC 3540 AGGGTGACAG GAGGCCGGGC AAGCAGGCGG GAGCGTCTCG GAGATGGTGTCGTGTTTAAG 3600 GACGGTCTCT AACAAGGAGG TCGTACAGGG AGATGGCCAA AGCAGACCGAGTTGCTGTAC 3660 GCCCTTTTGG GAAAAATGCT AGGGTTGGTG GCAACGTTAC TAGGTCGACCAGAAGGCTTA 3720 AGTCCTACCC CCCCCCCCCT TTTTTTTTTT TTTCCTCCAG AAGCCCTCTCTTGTCCCCGT 3780 CACCGGGGGC ACCGTACATC TGAGGCCGAG AGGACGCGAT GGGCCCGGCTTCCAAGCCGG 3840 TGTGGCTCGG CCAGCTGGCG CTTCGGGTCT TTTTTTTTTT TTTTTTTTTTTTTTCCTCCA 3900 GAAGCCTTGT CTGTCGCTGT CACCGGGGGC GCTGTACTTC TGAGGCCGAGAGGACGCGAT 3960 GGGCCCCGGC TTCCAAGCCG GTGTGGCTCG GCCAGCTGGA GCTTCGGGTCTTTTTTTTTT 4020 TTTTTTTTTT TTTTTTTCTC CAGAAGCCTT GTCTGTCGCT GTCACCGGGGGCGCTGTACT 4080 TCTGAGGCCG AGAGGACGCG ATGGGTCGGC TTCCAAGCCG ATGTGGCGGGGCCAGCTGGA 4140 GCTTCGGGTT TTTTTTTTTC CTCCAGAAGC CCTCTCTTGT CCCCGTCACCGGGGGCGCTG 4200 
TACTTCTGAG GCCGAGAGGA CGTGATGGGC CCGGGTTCCA GGCGGATGTCGCCCGGTCAG 4260 CTGGAGCTTT GGATCTTTTT TTTTTTTTTT CCTCCAGAAG CCCTCTCTTGTCCCCGTCAC 4320 CGGGGGCACC TTACATCTGA GGGCGAGAGG ACGTGATGGG TCCGGCTTCCAAGCCGATGT 4380 GGCGGGGCCA GCTGGAGCTT CGGGTTTTTT TTTTTTCCTC CAGAAGCCCTCTCTTGTCCC 4440 CGTCACCGGG GGCGCTGTAC TTCTGAGGCC GAGAGGACGT GATGGGCCCGGGTTCCAGGC 4500 GGATGTCGCC CGGTCAGCTG GAGCTTTGGA TCATTTTTTT TTTTCCCTCCAGAAGCCCTC 4560 TCTTGTCCCC GTCACCGGGG GCACCGTACA TCTGAGGCCG AGAGGACACGATGGGCCTGT 4620 CTTCCAAGCC GATGTGGCCC GGCCAGCTGG AGCTTCGGGT CTTTTTTTTTTTTTTTCCTC 4680 CAGAAGCCTT GTCTGTCGCT GTCACCCGGG GCGCTGTACT TCTGAGGCCGAGAGGACGCG 4740 ATGGGCCCGG CTTCCAAGCC GGTGTGGCTC GGCCAGCTGG AGCTTCGGGTCTTTTTTTTT 4800 TTTTTTTTTT TTCCTCCAGA AACCTTGTCT GTCGCTGTCA CCCGGGGCGCTTGTACTTCT 4860 GATGCCGAGA GGACGCGATG GGCCCGTCTT CCAGGCCGAT GTGGCCCGGTCAGCTGGAGC 4920 TTTGGATCTT TTTTTTTTTT TTTTCCTCCA GAAGCCCTCT CTTGTCCCCGTCACCGGGGG 4980 CACCTTACAT CTGAGGCCTA GAGGACACGA TGGGCCCGGG TTCCAGGCCGATGTGGCCCG 5040 GTCAGCTGGA GCTTTGGATC TTTTTTTTTT TTTTCTTCCA GAAGCCCTCTTGTCCCCGTC 5100 ACCGGTGGCA CTGTACATCT GAGGCGGAGA GGACATTATG GGCCCGGCTTCCAATCCGAT 5160 GTGGCCCGGT CAGCTGGAGC TTTGGATCTT ATTTTTTTTT TAATTTTTTCTTCCAGAAGC 5220 CCTCTTGTCC CTGTCACCGG TGGCACGGTA CATCTGAGGC CGAGAGGACATTATGGGCCC 5280 GGCTTCCAGG CCGATGTGGC CCGGTCAGCT GGAGCTTTGG ATCTTTTTTTTTTTTTTTCT 5340 TTTTTCCTCC AGAAGCCCTC TCTGTCCCTG TCACCGGGGG CCCTGTACGTCTGAGGCCGA 5400 GGGAAAGCTA TGGGCGCGGT TTTCTTTCAT TGACCTGTCG GTCTTATCAGTTCTCCGTGA 5460 TGTCAGGGTC GACCAGTTGT TCCTTTGAGG TCCGGTTCTT TTCGTTATGGGGTCATTTTT 5520 GGGCCACCTC CCCAGGTATG ACTTCCAGGC GTCGTTGCTC GCCTGTCACTTTCCTCCCTG 5580 TCTCTTTTAT GCTTGTGATC TTTTCTATCT GTTCCTATTG GACCTGGAGATAGGTACTGA 5640 CACGCTGTCC TTTCCCTATT AACACTAAAG GACACTATAA AGAGACCCTTTCGATTTAAG 5700 GCTGTTTTGC TTGTCCAGCC TATTCTTTTT ACTGGCTTGG GTCTGTCGCGGTGCCTGAAG 5760 CTGTCCCCGA GCCACGCTTC CTGCTTTCCC GGGCTTGCTG CTTGCGTGTGCTTGCTGTGG 5820 GCAGCTTGTG ACAACTGGGC GCTGTGACTT TGCTGCGTGT CAGACGTTTTTCCCGATTTC 5880 CCCGAGGTGT CGTTGTCACA CCTGTCCCGG 
TTGGAATGGT GGAGCCAGCTGTGGTTGAGG 5940 GCCACCTTAT TTCGGCTCAC TTTTTTTTTT TTTTTTTCTC TTGGAGTCCCGAACCTCCGC 6000 TCTTTTCTCT TCCCGGTCTT TCTTCCACAT GCCTCCCGAG TGCATTTCTTTTTGTTTTTT 6060 TTCTTTTTTT TTTTTTTTTT TTGGGGAGGT GGAGAGTCCC GAGTACTTCACTCCTGTCTG 6120 TGGTGTCCAA GTGTTCATGC CACGTGCCTC CCGAGTGCAC TTTTTTTTGTGGCAGTCGCT 6180 CGTTGTGTTC TCTTGTTCTG TGTCTGCCCG TATCAGTAAC TGTCTTGCCCCGCGTGTAAG 6240 ACATTCCTAT CTCGCTTGTT TCTCCCGATT GCGCGTCGTT GCTCACTCTTAGATCGATGT 6300 GGTGCTCCGG AGTTCTCTTC GGGCCAGGGC CAAGCCGCGC CAGGCGAGGGACGGACATTC 6360 ATGGCGAATG GCGGCCGCTC TTCTCGTTCT GCCAGCGGGC CCTCGTCTCTCCACCCCATC 6420 CGTCTGCCGG TGGTGTGTGG AAGGCAGGGG TGCGGCTCTC CGGCCCGACGCTGCCCCGCG 6480 CGCACTTTTC TCAGTGGTTC GCGTGGTCCT TGTGGATGTG TGAGGCGCCCGGTTGTGCCC 6540 TCACGTGTTT CACTTTGGTC GTGTCTCGCT TGACCATGTT CCCAGAGTCGGTGGATGTGG 6600 CCGGTGGCGT TGCATACCCT TCCCGTCTGG TGTGTGCACG CGCTGTTTCTTGTAAGCGTC 6660 GAGGTGCTCC TGGAGCGTTC CAGGTTTGTC TCCTAGGTGC CTGCTTCTGAGCTGGTGGTG 6720 GCGCTCCCCA TTCCCTGGTG TGCCTCCGGT GCTCCGTCTG GCTGTGTGCCTTCCCGTTTG 6780 TGTCTGAGAA GCCCGTGAGA GGGGGGTCGA GGAGAGAAGG AGGGGCAAGACCCCCCTTCT 6840 TCGTCGGGTG AGGCGCCCAC CCCGCGACTA GTACGCCTGT GCGTAGGGCTGGTGCTGAGC 6900 GGTCGCGGCT GGGGTTGGAA AGTTTCTCGA GAGACTCATT GCTTTCCCGTGGGGAGCTTT 6960 GAGAGGCCTG GCTTTCGGGG GGGACCGGTT GCAGGGTCTC CCCTGTCCGCGGATGCTCAG 7020 AATGCCCTTG GAAGAGAACC TTCCTGTTGC CGCAGACCCC CCCGCGCGGTCGCCCGCGTG 7080 TTGGTCTTCT GGTTTCCCTG TGTGCTCGTC GCATGCATCC TCTCTCGGTGGCCGGGGCTC 7140 GTCGGGGTTT TGGGTCCGTC CCGCCCTCAG TGAGAAAGTT TCCTTCTCTAGCTATCTTCC 7200 GGAAAGGGTG CGGGCTTCTT ACGGTCTCGA GGGGTCTCTC CCGAATGGTCCCCTGGAGGG 7260 CTCGCCCCCT GACCGCCTCC CGCGCGCGCA GCGTTTGCTC TCTCGTCTACCGCGGCCCGC 7320 GGCCTCCCCG CTCCGAGTTC GGGGAGGGAT CACGCGGGGC AGAGCCTGTCTGTCGTCCTG 7380 CCGTTGCTGC GGAGCATGTG GCTCGGCTTG TGTGGTTGGT GGCTGGGGAGAGGGCTCCGT 7440 GCACACCCCC GCGTGCGCGT ACTTTCCTCC CCTCCTGAGG GCCGCCGTGCGGACGGGGTG 7500 TGGGTAGGCG ACGGTGGGCT CCCGGGTCCC CACCCGTCTT CCCGTGCCTCACCCGTGCCT 7560 TCCGTCGCGT GCGTCCCTCT CGCTCGCGTC CACGACTTTG GCCGCTCCCGCGACGGCGGC 7620 
CTGCGCCGCG CGTGGTGCGT GCTGTGTGCT TCTCGGGCTG TGTGGTTGTGTCGCCTCGCC 7680 CCCCCCTTCC CGCGGCAGCG TTCCCACGGC TGGCGAAATC GCGGGAGTCCTCCTTCCCCT 7740 CCTCGGGGTC GAGAGGGTCC GTGTCTGGCG TTGATTGATC TCGCTCTCGGGGACGGGACC 7800 GTTCTGTGGG AGAACGGCTG TTGGCCGCGT CCGGCGCGAC GTCGGACGTGGGGACCCACT 7860 GCCGCTCGGG GGTCTTCGTC GGTAGGCATC GGTGTGTCGG CATCGGTCTCTCTCTCGTGT 7920 CGGTGTCGCC TCCTCGGGCT CCCGGGGGGC CGTCGTGTTT CGGGTCGGCTCGGCGCTGCA 7980 GGTGTGGTGG GACTGCTCAG GGGAGTGGTG CAGTGTGATT CCCGCCGGTTTTGCCTCGCG 8040 TGCCCTGACC GGTCCGACGC CCGAGCGGTC TCTCGGTCCC TTGTGAGGACCCCCTTCCGG 8100 GAGGGGCCCG TTTCGGCCGC CCTTGCCGTC GTCGCCGGCC CTCGTTCTGCTGTGTCGTTC 8160 CCCCCTCCCC GCTCGCCGCA GCCGGTCTTT TTTCCTCTCT CCCCCCCTCTCCTCTGACTG 8220 ACCCGTGGCC GTGCTGTCGG ACCCCCCGCA TGGGGGCGGC CGGGCACGTACGCGTCCGGG 8280 CGGTCACCGG GGTCTTGGGG GGGGGCCGAG GGGTAAGAAA GTCGGCTCGGCGGGCGGGAG 8340 GAGCTGTGGT TTGGAGGGCG TCCCGGCCCC GCGGCCGTGG CGGTGTCTTGCGCGGTCTTG 8400 GAGAGGGCTG CGTGCGAGGG GAAAAGGTTG CCCCGCGAGG GCAAAGGGAAAGAGGCTAGC 8460 AGTGGTCATT GTCCCGACGG TGTGGTGGTC TGTTGGCCGA GGTGCGTCTGGGGGGCTCGT 8520 CCGGCCCTGT CGTCCGTCGG GAAGGCGCGT GTTGGGGCCT GCCGGAGTGCCGAGGTGGGT 8580 ACCCTGGCGG TGGGATTAAC CCCGCGCGCG TGTCCCGGTG TGGCGGTGGGGGCTCCGGTC 8640 GATGTCTACC TCCCTCTCCC CGAGGTCTCA GGCCTTCTCC GCGCGGGCTCTCGGCCCTCC 8700 CCTCGTTCCT CCCTCTCGCG GGGTTCAAGT CGCTCGTCGA CCTCCCCTCCTCCGTCCTTC 8760 CATCTCTCGC GCAATGGCGC CGCCCGAGTT CACGGTGGGT TCGTCCTCCGCCTCCGCTTC 8820 TCGCCGGGGG CTGGCCGCTG TCCGGTCTCT CCTGCCCGAC CCCCGTTGGCGTGGTCTTCT 8880 CTCGCCGGCT TCGCGGACTC CTGGCTTCGC CCGGAGGGTC AGGGGGCTTCCCGGTTCCCC 8940 GACGTTGCGC CTCGCTGCTG TGTGCTTGGG GGGGGCCCGC TGCGGCCTCCGCCCGCCCGT 9000 GAGCCCCTGC CGCACCCGCC GGTGTGCGGT TTCGCGCCGC GGTCAGTTGGGCCCTGGCGT 9060 TGTGTCGCGT CGGGAGCGTG TCCGCCTCGC GGCGGCTAGA CGCGGGTGTCGCCGGGCTCC 9120 GACGGGTGGC CTATCCAGGG CTCGCCCCCG CCGACCCCCG CCTGCCCGTCCCGGTGGTGG 9180 TCGTTGGTGT GGGGAGTGAA TGGTGCTACC GGTCATTCCC TCCCGCGTGGTTTGACTGTC 9240 TCGCCGGTGT CGCGCTTCTC TTTCCGCCAA CCCCCACGCC AACCCACCACCCTGCTCTCC 9300 CGGCCCGGTG CGGTCGACGT TCCGGCTCTC 
CCGATGCCGA GGGGTTCGGGATTTGTGCCG 9360 GGGACGGAGG GGAGAGCGGG TAAGAGAGGT GTCGGAGAGC TGTCCCGGGGCGACGCTCGG 9420 GTTGGCTTTG CCGCGTGCGT GTGCTCGCGG ACGGGTTTTG TCGGACCCCGACGGGGTCGG 9480 TCCGGCCGCA TGCACTCTCC CGTTCCGCGC GAGCGCCCGC CCGGCTCACCCCCGGTTTGT 9540 CCTCCCGCGA GGCTCTCCGC CGCCGCCGCC TCCTCCTCCT CTCTCGCGCTCTCTGTCCCG 9600 CCTGGTCCTG TCCCACCCCC GACGCTCCGC TCGCGCTTCC TTACCTGGTTGATCCTGCCA 9660 GGTAGCATAT GCTTGTCTCA AAGATTAAGC CATGCATGTC TAAGTACGCACGGCCGGTAC 9720 AGTGAAACTG CGAATGGCTC ATTAAATCAG TTATGGTTCC TTTGGTCGCTCGCTCCTCTC 9780 CTACTTGGAT AACTGTGGTA ATTCTAGAGC TAATACATGC CGACGGGCGCTGACCCCCCT 9840 TCCCGGGGGG GGATGCGTGC ATTTATCAGA TCAAAACCAA CCCGGTGAGCTCCCTCCCGG 9900 CTCCGGCCGG GGGTCGGGCG CCGGCGGCTT GGTGACTCTA GATAACCTCGGGCCGATCGC 9960 ACGCCCCCCG TGGCGGCGAC GACCCATTCG AACGTCTGCC CTATCAACTTTCGATGGTAG 10020 TCGCCGTGCC TACCATGGTG ACCACGGGTG ACGGGGAATC AGGGTTCGATTCCGGAGAGG 10080 GAGCCTGAGA AACGGCTACC ACATCCAAGG AAGGCAGCAG GCGCGCAAATTACCCACTCC 10140 CGACCCGGGG AGGTAGTGAC GAAAAATAAC AATACAGGAC TCTTTCGAGGCCCTGTAATT 10200 GGAATGAGTC CACTTTAAAT CCTTTAACGA GGATCCATTG GAGGGCAAGTCTGGTGCCAG 10260 CAGCCGCGGT AATTCCAGCT CCAATAGCGT ATATTAAAGT TGCTGCAGTTAAAAAGCTGG 10320 TAGTTGGATC TTGGGAGCGG GCGGGCGGTC CGCCGCGAGG CGAGTCACCGCCCGTCCCCG 10380 CCCCTTGCCT CTCGGCGCCC CCTCGATGCT CTTAGCTGAG TGTCCCGCGGGGCCCGAAGC 10440 GTTTACTTTG AAAAAATTAG AGTGTTCAAA GCAGGCCCGA GCCGCCTGGATACCGCAGCT 10500 AGGAATAATG GAATAGGACC GCGGTTCTAT TTTGTTGGTT TTCGGAACTGAGGCCATGAT 10560 TAAGAGGGAC GGCCGGGGGC ATTCGTATTG CGCCGCTAGA GGTGAAATTCTTGGACCGGC 10620 GCAAGACGGA CCAGAGCGAA AGCATTTGCC AAGAATGTTT TCATTAATCAAGAACGAAAG 10680 TCGGAGGTTC GAAGACGATC AGATACCGTC GTAGTTCCGA CCATAAACGATGCCGACTGG 10740 CGATGCGGCG GCGTTATTCC CATGACCCGC CGGGCAGCTT CCGGGAAACCAAAGTCTTTG 10800 GGTTCCGGGG GGAGTATGGT TGCAAAGCTG AAACTTAAAG GAATTGACGGAAGGGCACCA 10860 CCAGGAGTGG GCCTGCGGCT TAATTTGACT CAACACGGGA AACCTCACCCGGCCCGGACA 10920 CGGACAGGAT TGACAGATTG ATAGCTCTTT CTCGATTCCG TGGGTGGTGGTGCATGGCCG 10980 TTCTTAGTTG GTGGAGCGAT TTGTCTGGTT AATTCCGATA 
ACGAACGAGACTCTGGCATG 11040 CTAACTAGTT ACGCGACCCC CGAGCGGTCG GCGTCCCCCA ACTTCTTAGAGGGACAAGTG 11100 GCGTTCAGCC ACCCGAGATT GAGCAATAAC AGGTCTGTGA TGCCCTTAGATGTCCGGGGC 11160 TGCACGCGCG CTACACTGAC TGGCTCAGCG TGTGCCTACC CTGCGCCGGCAGGCGCGGGT 11220 AACCCGTTGA ACCCCATTCG TGATGGGGAT CGGGGATTGC AATTATTCCCCATGAACGAG 11280 GAATTCCCAG TAAGTGCGGG TCATAAGCTT GCGTTGATTA AGTCCCTGCCCTTTGTACAC 11340 ACCGCCCGTC GCTACTACCG ATTGGATGGT TTAGTGAGGC CCTCGGATCGGCCCCGCCGG 11400 GGTCGGCCCA CGGCCCTGGC GGAGCGCTGA GAAGACGGTC GAACTTGACTATCTAGAGGA 11460 AGTAAAAGTC GTAACAAGGT TTCCGTAGGT GAACCTGCGG AAGGATCATTAAACGGGAGA 11520 CTGTGGAGGA GCGGCGGCGT GGCCCGCTCT CCCCGTCTTG TGTGTGTCCTCGCCGGGAGG 11580 CGCGTGCGTC CCGGGTCCCG TCGCCCGCGT GTGGAGCGAG GTGTCTGGAGTGAGGTGAGA 11640 GAAGGGGTGG GTGGGGTCGG TCTGGGTCCG TCTGGGACCG CCTCCGATTTCCCCTCCCCC 11700 TCCCCTCTCC CTCGTCCGGC TCTGACCTCG CCACCCTACC GCGGCGGCGGCTGCTCGCGG 11760 GCGTCTTGCC TCTTTCCCGT CCGGCTCTTC CGTGTCTACG AGGGGCGGTACGTCGTTACG 11820 GGTTTTTGAC CCGTCCCGGG GGCGTTCGGT CGTCGGGGCG CGCGCTTTGCTCTCCCGGCA 11880 CCCATCCCCG CCGCGGCTCT GGCTTTTCTA CGTTGGCTGG GGCGGTTGTCGCGTGTGGGG 11940 GGATGTGAGT GTCGCGTGTG GGCTCGCCCG TCCCGATGCC ACGCTTTTCTGGCCTCGCGT 12000 GTCCTCCCCG CTCCTGTCCC GGGTACCTAG CTGTCGCGTT CCGGCGCGGAGGTTTAAGGA 12060 CCCCGGGGGG GTCGCCCTGC CGCCCCCAGG GTCGGGGGGC GGTGGGGCCCGTAGGGAAGT 12120 CGGTCGTTCG GGCGGCTCTC CCTCAGACTC CATGACCCTC CTCCCCCCGCTGCCGCCGTT 12180 CCCGAGGCGG CGGTCGTGTG GGGGGGTGGA TGTCTGGAGC CCCCTCGGGCGCCGTGGGGG 12240 CCCGACCCGC GCCGCCGGCT TGCCCGATTT CCGCGGGTCG GTCCTGTCGGTGCCGGTCGT 12300 GGGTTCCCGT GTCGTTCCCG TGTTTTTCCG CTCCCGACCC TTTTTTTTTCCTCCCCCCCA 12360 CACGTGTCTC GTTTCGTTCC TGCTGGCCGG CCTGAGGCTA CCCCTCGGTCCATCTGTTCT 12420 CCTCTCTCTC CGGGGAGAGG AGGGCGGTGG TCGTTGGGGG ACTGTGCCGTCGTCAGCACC 12480 CGTGAGTTCG CTCACACCCG AAATACCGAT ACGACTCTTA GCGGTGGATCACTCGGCTCG 12540 TGCGTCGATG AAGAACGCAG CTAGCTGCGA GAATTAATGT GAATTGCAGGACACATTGAT 12600 CATCGACACT TCGAACGCAC TTGCGGCCCC GGGTTCCTCC CGGGGCTACGCCTGTCTGAG 12660 CGTCGGTTGA CGATCAATCG CGTCACCCGC TGCGGTGGGT 
GCTGCGCGGCTGGGAGTTTG 12720 CTCGCAGGGC CAACCCCCCA ACCCGGGTCG GGCCCTCCGT CTCCCGAAGTTCAGACGTGT 12780 GGGCGGTTGT CGGTGTGGCG CGCGCGCCCG CGTCGCGGAG CCTGGTCTCCCCCGCGCATC 12840 CGCGCTCGCG GCTTCTTCCC GCTCCGCCGT TCCCGCCCTC GCCCGTGCACCCCGGTCCTG 12900 GCCTCGCGTC GGCGCCTCCC GGACCGCTGC CTCACCAGTC TTTCTCGGTCCCGTGCCCCG 12960 TGGGAACCCA CCGCGCCCCC GTGGCGCCCG GGGGTGGGCG CGTCCGCATCTGCTCTGGTC 13020 GAGGTTGGCG GTTGAGGGTG TGCGTGCGCC GAGGTGGTGG TCGGTCCCCTGCGGCCGCGG 13080 GGTTGTCGGG GTGGCGGTCG ACGAGGGCCG GTCGGTCGCC TGCGGTGGTTGTCTGTGTGT 13140 GTTTGGGTCT TGCGCTGGGG GAGGCGGGGT CGACCGCTCG CGGGGTTGGCGCGGTCGCCC 13200 GGCGCCGCGC ACCCTCCGGC TTGTGTGGAG GGAGAGCGAG GGCGAGAACGGAGAGAGGTG 13260 GTATCCCCGG TGGCGTTGCG AGGGAGGGTT TGGCGTCCCG CGTCCGTCCGTCCCTCCCTC 13320 CCTCGGTGGG CGCCTTCGCG CCGCACGCGG CCGCTAGGGG CGGTCGGGGCCCGTGGCCCC 13380 CGTGGCTCTT CTTCGTCTCC GCTTCTCCTT CACCCGGGCG GTACCCGCTCCGGCGCCGGC 13440 CCGCGGGACG CCGCGGCGTC CGTGCGCCGA TGCGAGTCAC CCCCGGGTGTTGCGAGTTCG 13500 GGGAGGGAGA GGGCCTCGCT GACCCGTTGC GTCCCGGCTT CCCTGGGGGGGACCCGGCGT 13560 CTGTGGGCTG TGCGTCCCGG GGGTTGCGTG TGAGTAAGAT CCTCCACCCCCGCCGCCCTC 13620 CCCTCCCGCC GGCCTCTCGG GGACCCCCTG AGACGGTTCG CCGGCTCGTCCTCCCGTGCC 13680 GCCGGGTGCC GTCTCTTTCC CGCCCGCCTC CTCGCTCTCT TCTTCCCGCGGCTGGGCGCG 13740 TGTCCCCCCT TTCTGACCGC GACCTCAGAT CAGACGTGGC GACCCGCTGAATTTAAGCAT 13800 ATTAGTCAGC GGAGGAAAAG AAACTAACCA GGATTCCCTC AGTAACGGCGAGTGAACAGG 13860 GAAGAGCCCA GCGCCGAATC CCCGCCGCGC GTCGCGGCGT GGGAAATGTGGCGTACGGAA 13920 GACCCACTCC CCGGCGCCGC TCGTGGGGGG CCCAAGTCCT TCTGATCGAGGCCCAGCCCG 13980 TGGACGGTGT GAGGCCGGTA GCGGCCCCGG CGCGCCGGGC TCGGGTCTTCCCGGAGTCGG 14040 GTTGCTTGGG AATGCAGCCC AAAGCGGGTG GTAAACTCCA TCTAAGGCTAAATACCGGCA 14100 CGAGACCGAT AGTCAACAAG TACCGTAAGG GAAAGTTGAA AAGAACTTTGAAGAGAGAGT 14160 TCAAGAGGGC GTGAAACCGT TAAGAGGTAA ACGGGTGGGG TCCGCGCAGTCCGCCCGGAG 14220 GATTCAACCC GGCGGCGCGC GTCCGGCCGT GCCCGGTGGT CCCGGCGGATCTTTCCCGCT 14280 CCCCGTTCCT CCCGACCCCT CCACCCGCGC GTCGTTCCCC TCTTCCTCCCCGCGTCCGGC 14340 GCCTCCGGCG GCGGGCGCGG GGGGTGGTGT GGTGGTGGCG 
CGCGGGCGGGGCCGGGGGTG 14400 GGGTCGGCGG GGGACCGCCC CCGGCCGGCG ACCGGCCGCC GCCGGGCGCACTTCCACCGT 14460 GGCGGTGCGC CGCGACCGGC TCCGGGACGG CCGGGAAGGC CCGGTGGGGAAGGTGGCTCG 14520 GGGGGGGCGG CGCGTCTCAG GGCGCGCCGA ACCACCTCAC CCCGAGTGTTACAGCCCTCC 14580 GGCCGCGCTT TCGCCGAATC CCGGGGCCGA GGAAGCCAGA TACCCGTCGCCGCGCTCTCC 14640 CTCTCCCCCC GTCCGCCTCC CGGGCGGGCG TGGGGGTGGG GGCCGGGCCGCCCCTCCCAC 14700 GGCGCGACCG CTCTCCCACC CCCCTCCGTC GCCTCTCTCG GGGCCCGGTGGGGGGCGGGG 14760 CGGACTGTCC CCAGTGCGCC CCGGGCGTCG TCGCGCCGTC GGGTCCCGGGGGGACCGTCG 14820 GTCACGCGTC TCCCGACGAA GCCGAGCGCA CGGGGTCGGC GGCGATGTCGGCTACCCACC 14880 CGACCCGTCT TGAAACACGG ACCAAGGAGT CTAACGCGTG CGCGAGTCAGGGGCTCGTCC 14940 GAAAGCCGCC GTGGCGCAAT GAAGGTGAAG GGCCCCGCCC GGGGGCCCGAGGTGGGATCC 15000 CGAGGCCTCT CCAGTCCGCC GAGGGCGCAC CACCGGCCCG TCTCGCCCGCCGCGCCGGGG 15060 AGGTGGAGCA CGAGCGTACG CGTTAGGACC CGAAAGATGG TGAACTATGCTTGGGCAGGG 15120 CGAAGCCAGA GGAAACTCTG GTGGAGGTCC GTAGCGGTCC TGACGTGCAAATCGGTCGTC 15180 CGACCTGGGT ATAGGGGCGA AAGACTAATC GAACCATCTA GTAGCTGGTTCCCTCCGAAG 15240 TTTCCCTCAG GATAGCTGGC GCTCTCGCTC CCGACGTACG CAGTTTTATCCGGTAAAGCG 15300 AATGATTAGA GGTCTTGGGG CCGAAACGAT CTCAACCTAT TCTCAAACTTTAAATGGGTA 15360 AGAAGCCCGG CTCGCTGGCG TGGAGCCGGG CGTGGAATGC GAGTGCCTAGTGGGCCACTT 15420 TTGGTAAGCA GAACTGGCGC TGCGGGATGA ACCGAACGCC GGGTTAAGGCGCCCGATGCC 15480 GACGCTCATC AGACCCCAGA AAAGGTGTTG GTTGATATAG ACAGCAGGACGGTGGCCATG 15540 GAAGTCGGAA TCCGCTAAGG AGTGTGTAAC AACTCACCTG CCGAATCAACTAGCCCTGAA 15600 AATGGATGGC GCTGGAGCGT CGGGCCCATA CCCGGCCGTC GCCGCAGTCGGAACGGAACG 15660 GGACGGGAGC GGCCGCGGGT GCGCGTCTCT CGGGGTCGGG GGTGCGTGGCGGGGGCCCGT 15720 CCCCCGCCTC CCCTCCGCGC GCCGGGTTCG CCCCCGCGGC GTCGGGCCCCGCGGAGCCTA 15780 CGCCGCGACG AGTAGGAGGG CCGCTGCGGT GAGCCTTGAA GCCTAGGGCGCGGGCCCGGG 15840 TGGAGCCGCC GCAGGTGCAG ATCTTGGTGG TAGTAGCAAA TATTCAAACGAGAACTTTGA 15900 AGGCCGAAGT GGAGAAGGGT TCCATGTGAA CAGCAGTTGA ACATGGGTCAGTCGGTCCTG 15960 AGAGATGGGC GAGTGCCGTT CCGAAGGGAC GGGCGATGGC CTCCGTTGCCCTCGGCCGAT 16020 CGAAAGGGAG TCGGGTTCAG ATCCCCGAAT CCGGAGTGGC 
GGAGATGGGCGCCGCGAGGC 16080 CAGTGCGGTA ACGCGACCGA TCCCGGAGAA GCCGGCGGGA GGCCTCGGGGAGAGTTCTCT 16140 TTTCTTTGTG AAGGGCAGGG CGCCCTGGAA TGGGTTCGCC CCGAGAGAGGGGCCCGTGCC 16200 TTGGAAAGCG TCGCGGTTCC GGCGGCGTCC GGTGAGCTCT CGCTGGCCCTTGAAAATCCG 16260 GGGGAGAGGG TGTAAATCTC GCGCCGGGCC GTACCCATAT CCGCAGCAGGTCTCCAAGGT 16320 GAACAGCCTC TGGCATGTTG GAACAATGTA GGTAAGGGAA GTCGGCAAGCCGGATCCGTA 16380 ACTTCGGGAT AAGGATTGGC TCTAAGGGCT GGGTCGGTCG GGCTGGGGCGCGAAGCGGGG 16440 CTGGGCGCGC GCCGCGGCTG GACGAGGCGC CGCCGCCCTC TCCCACGTCCGGGGAGACCC 16500 CCCGTCCTTT CCGCCCGGGC CCGCCCTCCC CTCTTCCCCG CGGGGCCCCGTCGTCCCCCG 16560 CGTCGTCGCC ACCTCTCTTC CCCCCTCCTT CTTCCCGTCG GGGGGCGGGTCGGGGGTCGG 16620 CGCGCGGCGC GGGCTCCGGG GCGGCGGGTC CAACCCCGCG GGGGTTCCGGAGCGGGAGGA 16680 ACCAGCGGTC CCCGGTGGGG CGGGGGGCCC GGACACTCGG GGGGCCGGCGGCGGCGGCGA 16740 CTCTGGACGC GAGCCGGGCC CTTCCCGTGG ATCGCCTCAG CTGCGGCGGGCGTCGCGGCC 16800 GCTCCCGGGG AGCCCGGCGG GTGCCGGCGC GGGTCCCCTC CCCGCGGGGCCTCGCTCCAC 16860 CCCCCCATCG CCTCTCCCGA GGTGCGTGGC GGGGGCGGGC GGGCGTGTCCCGCGCGTGTG 16920 GGGGGAACCT CCGCGTCGGT GTTCCCCCGC CGGGTCCGCC CCCCGGGCCGCGGTTTTCCG 16980 CGCGGCGCCC CCGCCTCGGC CGGCGCCTAG CAGCCGACTT AGAACTGGTGCGGACCGGGG 17040 GAATCCGACT GTTTAATTAA AACAAAGCAT CGCGAAGGCC CGCGGCGGGTGTTGACGCGA 17100 TGTGATTTCT GCCCAGTGCT CTGAATGTCA AAGTGAAGAA ATTCAATGAAGCGCGGGTAA 17160 ACGGCGGGAG TAACTATGAC TCTCTTAAGG TAGCCAAATG CCTCGTCATCTAATTAGTAA 17220 CGCGCATGAA TGGATGAACG AGATTCCCAC TGTCCCTACC TACTATCCAGCGAAACCACA 17280 GCCAAGGGAA CGGGCTTGGC GGAATCAGCG GGGAAAGAAG ACCCTGTTGAGCTTGACTCT 17340 AGTCTGGCAC GGTGAAGAGA CATGAGAGGT GTAGAATAAG TGGGAGGCCCCCGGCGCCCG 17400 GCCCCGTCCT CGCGTCGGGG TCGGGGCACG CCGGCCTCGC GGGCCGCCGGTGAAATACCA 17460 CTACTCTCAT CGTTTTTTCA CTGACCCGGT GAGGCGGGGG GGCGAGCCCCGAGGGGCTCT 17520 CGCTTCTGGC GCCAAGCGTC CGTCCCGCGC GTGCGGGCGG GCGCGACCCGCTCCGGGGAC 17580 AGTGCCAGGT GGGGAGTTTG ACTGGGGCGG TACACCTGTC AAACGGTAACGCAGGTGTCC 17640 TAAGGCGAGC TCAGGGAGGA CAGAAACCTC CCGTGGAGCA GAAGGGCAAAAGCTCGCTTG 17700 ATCTTGATTT TCAGTACGAA TACAGACCGT GAAAGCGGGG 
CCTCACGATCCTTCTGACCT 17760 TTTGGGTTTT AAGCAGGAGG TGTCAGAAAA GTTACCACAG GGATAACTGGCTTGTGGCGG 17820 CCAAGCGTTC ATAGCGACGT CGCTTTTTGA TCCTTCGATG TCGGCTCTTCCTATCATTGT 17880 GAAGCAGAAT TCACCAAGCG TTGGATTGTT CACCCACTAA TAGGGAACGTGAGCTGGGTT 17940 TAGACCGTCG TGAGACAGGT TAGTTTTACC CTACTGATGA TGTGTTGTTGCCATGGTAAT 18000 CCTGCTCAGT ACGAGAGGAA CCGCAGGTTC AGACATTTGG TGTATGTGCTTGGCTGAGGA 18060 GCCAATGGGG CGAAGCTACC ATCTGTGGGA TTATGACTGA ACGCCTCTAAGTCAGAATCC 18120 GCCCAAGCGG AACGATACGG CAGCGCCGAA GGAGCCTCGG TTGGCCCCGGATAGCCGGGT 18180 CCCCGTCCGT CCCGCTCGGC GGGGTCCCCG CGTCGCCCCG CGGCGGCGCGGGGTCTCCCC 18240 CCGCCGGGCG TCGGGACCGG GGTCCGGTGC GGAGAGCCGT TCGTCTTGGGAAACGGGGTG 18300 CGGCCGGAAA GGGGGCCGCC CTCTCGCCCG TCACGTTGAA CGCACGTTCGTGTGGAACCT 18360 GGCGCTAAAC CATTCGTAGA CGACCTGCTT CTGGGTCGGG GTTTCGTACGTAGCAGAGCA 18420 GCTCCCTCGC TGCGATCTAT TGAAAGTCAG CCCTCGACAC AAGGGTTTGTCTCTGCGGGC 18480 TTTCCCGTCG CACGCCCGCT CGCTCGCACG CGACCGTGTC GCCGCCCGGGCGTCACGGGG 18540 GCGGTCGCCT CGGCCCCCGC GCGGTTGCCC GAACGACCGT GTGGTGGTTGGGGGGGGGAT 18600 CGTCTTCTCC TCCGTCTCCC GAGGACGGTT CGTTTCTCTT TCCCCTTCCGTCGCTCTCCT 18660 TGGGTGTGGG AGCCTCGTGC CGTCGCGACC GCGGCCTGCC GTCGCCTGCCGCCGCAGCCC 18720 CTTGCCCTCC GGCCTTGGCC AAGCCGGAGG GCGGAGGAGG GGGATCGGCGGCGGCGGCGA 18780 CCGCGGCGCG GTGACGCACG GTGGGATCCC CATCCTCGGC GCGTCCGTCGGGGACGGCCG 18840 GTTGGAGGGG CGGGAGGGGT TTTTCCCGTG AACGCCGCGT TCGGCGCCAGGCCTCTGGCG 18900 GCCGGGGGGG CGCTCTCTCC GCCCGAGCAT CCCCACTCCC GCCCCTCCTCTTCGCGCGCC 18960 GCGGCGGCGA CGTGCGTACG AGGGGAGGAT GTCGCGGTGT GGAGGCGGAGAGGGTCCGGC 19020 GCGGCGCCTC TTCCATTTTT TCCCCCCCAA CTTCGGAGGT CGACCAGTACTCCGGGCGAC 19080 ACTTTGTTTT TTTTTTTTCC CCCGATGCTG GAGGTCGACC AGATGTCCGAAAGTGTCCCC 19140 CCCCCCCCCC CCCCCCGGCG CGGAGCGGCG GGGCCACTCT GGACTCTTTTTTTTTTTTTT 19200 TTTTTTTTTT TTAAATTCCT GGAACCTTTA GGTCGACCAG TTGTCCGTCTTTTACTCCTT 19260 CATATAGGTC GACCAGTACT CCGGGTGGTA CTTTGTCTTT TTCTGAAAATCCCAGAGGTC 19320 GACCAGATAT CCGAAAGTCC TCTCTTTCCC TTTACTCTTC CCCACAGCGATTCTCTTTTT 19380 TTTTTTTTTT TTTGGTGTGC CTCTTTTTGA CTTATATACA 
TGTAAATAGTGTGTACGTTT 19440 ATATACTTAT AGGAGGAGGT CGACCAGTAC TCCGGGCGAC ACTTTGTTTTTTTTTTTTTT 19500 TCCACCGATG ATGGAGGTCG ACCAGATGTC CGAAAGTGTC CCGTCCCCCCCCTCCCCCCC 19560 CCGCGACGCG GCGGGCTCAC TCTGGACTCT TTTTTTTTTT TTTTTTTTTTTTTAAATTTC 19620 TGGAACCTTA AGGTCGACCA GTTGTCCGTC TTTCACTCAT TCATATAGGTCGACCGGTGG 19680 TACTTTGTCT TTTTCTGAAA ATCGCAGAGG TCGACCAGAT GTCAGAAAGTCTGGTGGTCG 19740 ATAAATTATC TGATCTAGAT TTGTTTTTCT GTTTTTCAGT TTTGTGTTGTTTTGTGTTGT 19800 TTTGTGTTGT TTTGTTTTGT TTTGTTTTGT TTTGTTTTGT TTTGTTTTGTTTTGTTTTGT 19860 TTTGTGTTGT GTTGTGTTGT GTTGTGTTGG GTTGGGTTGG GTTGGGTTGGGTTGGGTTGG 19920 GTTGGGTTGG GTTGGGTTGT GTTGTTTGGT TTTGTGTTGT TTGGTGTTGTTGGTTTTGTT 19980 TTGTTTGCTG TTGTTTTGTG TTTTGCGGGT CGAACAGTTG TCCCTAACCGAGTTTTTTTG 20040 TACACAAACA TGCACTTTTT TTAAAATAAA TTTTTAAAAT AAATGCGAAAATCGACCAAT 20100 TATCCCTTTC CTTCTCTCTC TTTTTTAAAA ATTTTCTTTG TGTGTGTGTGTGTGTGTGTG 20160 TGTGTGTGTG TGCGTGTGTG TGTGTGTGTG CGTGCAGCGT GCGCGCGCTCGTTTTATAAA 20220 TACTTATAAT AATAGGTCGC CGGGTGGTGG TAGCTTCCCG GACTCCAGAGGCAGAGGCAG 20280 GCAGACTTCT GAGTTCGAGG CCAGCCTGGT CTACAGAGGA ACCCTGTCTCGAAAAATGAA 20340 AATAAATACA TACATACATA CATACATACA TACATACATA CATACATACATACATATGAG 20400 GTTGACCAGT TGTCAATCCT TTAGAATTTT GTTTTTAATT AATGTGATAGAGAGATAGAT 20460 AATAGATAGA TGGATAGAGT GATACAAATA TAGGTTTTTT TTTCAGTAAATATGAGGTTG 20520 ATTAACCACT TTTCCCTTTT TAGGTTTTTT TTTTTTTCCC CTGTCCATGTGGTTGCTGGG 20580 ATTTGAACTC AGGACCCTGG CAGGTCAACT GGAAAACGTG TTTTCTATATATATAAATAG 20640 TGGTCTGTCT GCTGTTTGTT TGTTTGCTTG CTTGCTTGCT TGCTTGCTTGCTTGCTTGCT 20700 TGCTTTTTTT TTTCTTCTGA GACAGTATTT CTCTGTGTAA CCTGGTGCCCTGAAACTCAC 20760 TCTGTAGACC AGCCTGGCCT CAATCGAACT CAGAAATCCT CCTGCCTCTTGTCTACCTCC 20820 CAATTTTGGA GTAAAGGTGT GCTACACCAC TGCCTGGCAT TATTATCATTATCATTATTA 20880 ATTTTATTAT TAGACAGAAC GAAATCAACT AGTTGGTCCT GTTTCGTTAATTCATTTGAA 20940 ATTAGTTGGA CCAATTAGTT GGCTGGTTTG GGAGGTTTCT TTTGTTTCCGATTTGGGTGT 21000 TTGTGGGGCT GGGGATCAGG TATCTCAACG GAATGCATGA AGGTTAAGGTGAGATGGCTC 21060 GATTTTTGTA AAGATTACTT TTCTTAGTCT GAGGAAAAAA 
TAAAATAATATTGGGCTACG 21120 TTTCATTGCT TCATTTCTAT TTCTCTTTCT TTCTTTCTTT CTTTCAGATAAGGAGGTCGG 21180 CCAGTTCCTC CTGCCTTCTG GAAGATGTAG GCATTGCATT GGGAAAAGCATTGTTCGAGA 21240 GATGTGCTAG TGAACCAGAG AGTTTGGATG TCAAGCCGTA TAATGTTTATTACAATATAG 21300 AAAAGTTCTA ACAAAGTGAT CTTTAACTTT TTTTTTTTTT TTTCTCCTTCTACTTCTACT 21360 TGTTCTCACT CTGCCACCAA CGCGCTTTGT ACATTGAATG TGAGCTTTGTTTTGCTTAAC 21420 AGACATATAT TTTTTCTTTT GGTTTTGCTT GACATGGTTT CCCTTTCTATCCGTGCAGGG 21480 TTCCCAGACG GCCTTTTGAG AATAAAATGG GAGGCCAGAA CCAAAGTCTTTTGAATAAAG 21540 CACCACAACT CTAACCTGTT TGGCTGTTTT CCTTCCCAAG GCACAGATCTTTCCCAGCAT 21600 GGAAAAGCAT GTAGCAGTTG TAGGACACAC TAGACGAGAG CACCAGATCTCATTGTGGGT 21660 GGTTGTGAAC CACCCACCAT GTGGTTGCCT GGGATTTGAA CTCAGGATCTTCAGAAGACG 21720 AGTCAGGGCT CTAAACCGAT GAGCCATCTC TCCAGCCCTC CTACATTCCTTCTTAAGGCA 21780 TGAATGATCC CAGCATGGGA AGACAGTCTG CCCTCTTTGT GGTATATCACCATATACTCA 21840 ATAAAATAAT GAAATGAATG AAGTCTCCAC GTATTTATTT CTTCGAGCTATCTAAATTCT 21900 CTCACAGCAC CTCCCCCTCC CCCACACTGC CTTTCTCCCT ATGTTTGGGTGGGGCTGGGG 21960 GAGGGGTGGG GTGGGGGCAG GGATCTGCAT GTCTTCTTGC AGGTCTGTGAACTATTTGCG 22020 ATGGCCTGGT TCTCTGAACT GTTGAGCCTT GTCTATCCAG AGGCTGACTGGCTAGTTTTC 22080 TACCTGAAGT CCCTGAGTGA TGATTTCCCT GTGAATTC 22118

(2) INFORMATION FOR SEQ ID NO: 17: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 42999 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GCTGACACGC TGTCCTCTGGCGACCTGTCG TCGGAGAGGT TGGGCCTCCG GATGCGCGCG 60 GGGCTCTGGC CTCACGGTGACCGGCTAGCC GGCCGCGCTC CTGCCTTGAG CCGCCTGCCG 120 CGGCCCGCGG GCCTGCTGTTCTCTCGCGCG TCCGAGCGTC CCGACTCCCG GTGCCGGCCC 180 GGGTCCGGGT CTCTGACCCACCCGGGGGCG GCGGGGAAGG CGGCGAGGGC CACCGTGCCC 240 CGTGCGCTCT CCGCTGCGGGCGCCCGGGGC GCCGCACAAC CCCACCCGCT GGCTCCGTGC 300 CGTGCGTGTC AGGCGTTCTCGTCTCCGCGG GGTTGTCCGC CGCCCCTTCC CCGGAGTGGG 360 GGGTGGCCGG AGCCGATCGGCTCGCTGGCC GGCCGGCCTC 
CGCTCCCGGG GGGCTCTTCG 420 ATCGATGTGG TGACGTCGTGCTCTCCCGGG CCGGGTCCGA GCCGCGACGG GCGAGGGGCG 480 GACGTTCGTG GCGAACGGGACCGTCCTTCT CGCTCCGCCC GCGCGGTCCC CTCGTCTGCT 540 CCTCTCCCCG CCCGCCGGCCGGCGTGTGGG AAGGCGTGGG GTGCGGACCC CGGCCCGACC 600 TCGCCGTCCC GCCCGCCGCCTTCGCTTCGC GGGTGCGGGC CGGCGGGGTC CTCTGACGCG 660 GCAGACAGCC CTGCCTGTCGCCTCCAGTGG TTGTCGACTT GCGGGCGGCC CCCCTCCGCG 720 GCGGTGGGGG TGCCGTCCCGCCGGCCCGTC GTGCTGCCCT CTCGGGGGGG GTTTGCGCGA 780 GCGTCGGCTC CGCCTGGGCCCTTGCGGTGC TCCTGGAGCG CTCCGGGTTG TCCCTCAGGT 840 GCCCGAGGCC GAACGGTGGTGTGTCGTTCC CGCCCCCGGC GCCCCCTCCT CCGGTCGCCG 900 CCGCGGTGTC CGCGCGTGGGTCCTGAGGGA GCTCGTCGGT GTGGGGTTCG AGGCGGTTTG 960 AGTGAGACGA GACGAGACGCGCCCCTCCCA CGCGGGGAAG GGCGCCCGCC TGCTCTCGGT 1020 GAGCGCACGT CCCGTGCTCCCCTCTGGCGG GTGCGCGCGG GCCGTGTGAG CGATCGCGGT 1080 GGGTTCGGGC CGGTGTGACGCGTGCGCCGG CCGGCCGCCG AGGGGCTGCC GTTCTGCCTC 1140 CGACCGGTCG TGTGTGGGTTGACTTCGGAG GCGCTCTGCC TCGGAAGGAA GGAGGTGGGT 1200 GGACGGGGGG GCCTGGTGGGGTTGCGCGCA CGCGCGCACC GGCCGGGCCC CCGCCCTGAA 1260 CGCGAACGCT CGAGGTGGCCGCGCGCAGGT GTTTCCTCGT ACCGCAGGGC CCCCTCCCTT 1320 CCCCAGGCGT CCCTCGGCGCCTCTGCGGGC CCGAGGAGGA GCGGCTGGCG GGTGGGGGGA 1380 GTGTGACCCA CCCTCGGTGAGAAAAGCCTT CTCTAGCGAT CTGAGAGGCG TGCCTTGGGG 1440 GTACCGGATC CCCCGGGCCGCCGCCTCTGT CTCTGCCTCC GTTATGGTAG CGCTGCCGTA 1500 GCGACCCGCT CGCAGAGGACCCTCCTCCGC TTCCCCCTCG ACGGGGTTGG GGGGGAGAAG 1560 CGAGGGTTCC GCCGGCCACCGCGGTGGTGG CCGAGTGCGG CTCGTCGCCT ACTGTGGCCC 1620 GCGCCTCCCC CTTCCGAGTCGGGGGAGGAT CCCGCCGGGC CGGGCCCGGC GCTCCCACCC 1680 AGCGGGTTGG GACGCGGCGGCCGGCGGGCG GTGGGTGTGC GCGCCCGGCG CTCTGTCCGG 1740 CGCGTGACCC CCTCCGTCCGCGAGTCGGCT CTCCGCCCGC TCCCGTGCCG AGTCGTGACC 1800 GGTGCCGACG ACCGCGTTTGCGTGGCACGG GGTCGGGCCC GCCTGGCCCT GGGAAAGCGT 1860 CCCACGGTGG GGGCGCGCCGGTCTCCCGGA GCGGGACCGG GTCGGAGGAT GGACGAGAAT 1920 CACGAGCGAC GGTGGTGGTGGCGTGTCGGG TTCGTGGCTG CGGTCGCTCC GGGGCCCCCG 1980 GTGGCGGGGC CCCGGGGCTCGCGAGGCGGT TCTCGGTGGG GGCCGAGGGC CGTCCGGCGT 2040 CCCAGGCGGG GCGCCGCGGGACCGCCCTCG TGTCTGTGGC GGTGGGATCC CGCGGCCGTG 2100 TTTTCCTGGT 
GGCCCGGCCGTGCCTGAGGT TTCTCCCCGA GCCGCCGCCT CTGCGGGCTC 2160 CCGGGTGCCC TTGCCCTCGCGGTCCCCGGC CCTCGCCCGT CTGTGCCCTC TTCCCCGCCC 2220 GCCGCCCGCC GATCCTCTTCTTCCCCCCGA GCGGCTCACC GGCTTCACGT CCGTTGGTGG 2280 CCCCGCCTGG GACCGAACCCGGCACCGCCT CGTGGGGCGC CGCCGCCGGC CACTGATCGG 2340 CCCGGCGTCC GCGTCCCCCGGCGCGCGCCT TGGGGACCGG GTCGGTGGCG CGCCGCGTGG 2400 GGCCCGGTGG GCTTCCCGGAGGGTTCCGGG GGTCGGCCTG CGGCGCGTGC GGGGGAGGAG 2460 ACGGTTCCGG GGGACCGGCCGCGGCTGCGG CGGCGGCGGT GGTGGGGGGA GCCGCGGGGA 2520 TCGCCGAGGG CCGGTCGGCCGCCCCGGGTG CCCCGCGGTG CCGCCGGCGG CGGTGAGGCC 2580 CCGCGCGTGT GTCCCGGCTGCGGTCGGCCG CGCTCGAGGG GTCCCCGTGG CGTCCCCTTC 2640 CCCGCCGGCC GCCTTTCTCGCGCCTTCCCC GTCGCCCCGG CCTCGCCCGT GGTCTCTCGT 2700 CTTCTCCCGG CCCGCTCTTCCGAACCGGGT CGGCGCGTCC CCCGGGTGCG CCTCGCTTCC 2760 CGGGCCTGCC GCGGCCCTTCCCCGAGGCGT CCGTCCCGGG CGTCGGCGTC GGGGAGAGCC 2820 CGTCCTCCCC GCGTGGCGTCGCCCCGTTCG GCGCGCGCGT GCGCCCGAGC GCGGCCCGGT 2880 GGTCCCTCCC GGACAGGCGTTCGTGCGACG TGTGGCGTGG GTCGACCTCC GCCTTGCCGG 2940 TCGCTCGCCC TCTCCCCGGGTCGGGGGGTG GGGCCCGGGC CGGGGCCTCG GCCCCGGTCG 3000 CTGCCTCCCG TCCCGGGCGGGGGCGGGCGC GCCGGCCGGC CTCGGTCGCC CTCCCTTGGC 3060 CGTCGTGTGG CGTGTGCCACCCCTGCGCCG GCGCCCGCCG GCGGGGCTCG GAGCCGGGCT 3120 TCGGCCGGGC CCCGGGCCCTCGACCGGACC GGCTGCGCGG GCGCTGCGGC CGCACGGCGC 3180 GACTGTCCCC GGGCCGGGCACCGCGGTCCG CCTCTCGCTC GCCGCCCGGA CGTCGGGGCC 3240 GCCCCGCGGG GCGGGCGGAGCGCCGTCCCC GCCTCGCCGC CGCCCGCGGG CGCCGGCCGC 3300 GCGCGCGCGC GCGTGGCCGCCGGTCCCTCC CGGCCGCCGG GCGCGGGTCG GGCCGTCCGC 3360 CTCCTCGCGG GCGGGCGCGACGAAGAAGCG TCGCGGGTCT GTGGCGCGGG GCCCCCGGTG 3420 GTCGTGTCGC GTGGGGGGCGGGTGGTTGGG GCGTCCGGTT CGCCGCGCCC CGCCCCGGCC 3480 CCACCGGTCC CGGCCGCCGCCCCCGCGCCC GCTCGCTCCC TCCCGTCCGC CCGTCCGCGG 3540 CCCGTCCGTC CGTCCGTCCGTCGTCCTCCT CGCTTGCGGG GCGCCGGGCC CGTCCTCGCG 3600 AGGCCCCCCG GCCGGCCGTCCGGCCGCGTC GGGGGCTCGC CGCGCTCTAC CTTACCTACC 3660 TGGTTGATCC TGCCAGTAGCATATGCTTGT CTCAAAGATT AAGCCATGCA TGTCTAAGTA 3720 CGCACGGCCG GTACAGTGAAACTGCGAATG GCTCATTAAA TCAGTTATGG TTCCTTTGGT 3780 CGCTCGCTCC TCTCCTACTTGGATAACTGT GGTAATTCTA 
GAGCTAATAC ATGCCGACGG 3840 GCGCTGACCC CCTTCGCGGGGGGGATGCGT GCATTTATCA GATCAAAACC AACCCGGTCA 3900 GCCCCTCTCC GGCCCCGGCCGGGGGGCGGG CGCCGGCGGC TTTGGTGACT CTAGATAACC 3960 TCGGGCCGAT CGCACGCCCCCCGTGGCGGC GACGACCCAT TCGAACGTCT GCCCTATCAA 4020 CTTTCGATGG TAGTCGCCGTGCCTACCATG GTGACCACGG GTGACGGGGA ATCAGGGTTC 4080 GATTCCGGAG AGGGAGCCTGAGAAACGGCT ACCACATCCA AGGAAGGCAG CAGGCGCGCA 4140 AATTACCCAC TCCCGACCCGGGGAGGTAGT GACGAAAAAT AACAATACAG GACTCTTTCG 4200 AGGCCCTGTA ATTGGAATGAGTCCACTTTA AATCCTTTAA CGAGGATCCA TTGGAGGGCA 4260 AGTCTGGTGC CAGCAGCCGCGGTAATTCCA GCTCCAATAG CGTATATTAA AGTTGCTGCA 4320 GTTAAAAAGC TCGTAGTTGGATCTTGGGAG CGGGCGGGCG GTCCGCCGCG AGGCGAGCCA 4380 CCGCCCGTCC CCGCCCCTTGCCTCTCGGCG CCCCCTCGAT GCTCTTAGCT GAGTGTCCCG 4440 CGGGGCCCGA AGCGTTTACTTTGAAAAAAT TAGAGTGTTC AAAGCAGGCC CGAGCCGCCT 4500 GGATACCGCA GCTAGGAATAATGGAATAGG ACCGCGGTTC TATTTTGTTG GTTTTCGGAA 4560 CTGAGGCCAT GATTAAGAGGGACGGCCGGG GGCATTCGTA TTGCGCCGCT AGAGGTGAAA 4620 TTCTTGGACC GGCGCAAGACGGACCAGAGC GAAAGCATTT GCCAAGAATG TTTTCATTAA 4680 TCAAGAACGA AAGTCGGAGGTTCGAAGACG ATCAGATACC GTCGTAGTTC CGACCATAAA 4740 CGATGCCGAC CGGCGATGCGGCGGCGTTAT TCCCATGACC CGCCGGGCAG CTTCCGGGAA 4800 ACCAAAGTCT TTGGGTTCCGGGGGGAGTAT GGTTGCAAAG CTGAAACTTA AAGGAATTGA 4860 CGGAAGGGCA CCACCAGGAGTGGAGCCTGC GGCTTAATTT GACTCAACAC GGGAAACCTC 4920 ACCCGGCCCG GACACGGACAGGATTGACAG ATTGATAGCT CTTTCTCGAT TCCGTGGGTG 4980 GTGGTGCATG GCCGTTCTTAGTTGGTGGAG CGATTTGTCT GGTTAATTCC GATAACGAAC 5040 GAGACTCTGG CATGCTAACTAGTTACGCGA CCCCCGAGCG GTCGGCGTCC CCCAACTTCT 5100 TAGAGGGACA AGTGGCGTTCAGCCACCCGA GATTGAGCAA TAACAGGTCT GTGATGCCCT 5160 TAGATGTCCG GGGCTGCACGCGCGCTACAC TGACTGGCTC AGCGTGTGCC TACCCTACGC 5220 CGGCAGGCGC GGGTAACCCGTTGAACCCCA TTCGTGATGG GGATCGGGGA TTGCAATTAT 5280 TCCCCATGAA CGAGGGAATTCCCGAGTAAG TGCGGGTCAT AAGCTTGCGT TGATTAAGTC 5340 CCTGCCCTTT GTACACACCGCCCGTCGCTA CTACCGATTG GATGGTTTAG TGAGGCCCTC 5400 GGATCGGCCC CGCCGGGGTCGGCCCACGGC CCTGGCGGAG CGCTGAGAAG ACGGTCGAAC 5460 TTGACTATCT AGAGGAAGTAAAAGTCGTAA CAAGGTTTCC GTAGGTGAAC CTGCGGAAGG 5520 ATCATTAACG 
GAGCCCGGAGGGCGAGGCCC GCGGCGGCGC CGCCGCCGCC GCGCGCTTCC 5580 CTCCGCACAC CCACCCCCCCACCGCGACGC GGCGCGTGCG CGGGCGGGGC CCGCGTGCCC 5640 GTTCGTTCGC TCGCTCGTTCGTTCGCCGCC CGGCCCCGCC GCCGCGAGAG CCGAGAACTC 5700 GGGAGGGAGA CGGGGGGGAGAGAGAGAGAG AGAGAGAGAG AGAGAGAGAG AGAGAGAGAA 5760 AGAAGGGCGT GTCGTTGGTGTGCGCGTGTC GTGGGGCCGG CGGGCGGCGG GGAGCGGTCC 5820 CCGGCCGCGG CCCCGACGACGTGGGTGTCG GCGGGCGCGG GGGCGGTTCT CGGCGGCGTC 5880 GCGGCGGGTC TGGGGGGGTCTCGGTGCCCT CCTCCCCGCC GGGGCCCGTC GTCCGGCCCC 5940 GCCGCGCCGG CTCCCCGTCTTCGGGGCCGG CCGGATTCCC GTCGCCTCCG CCGCGCCGCT 6000 CCGCGCCGCC GGGCACGGCCCCGCTCGCTC TCCCCGGCCT TCCCGCTAGG GCGTCTCGAG 6060 GGTCGGGGGC CGGACGCCGGTCCCCTCCCC CGCCTCCTCG TCCGCCCCCC CGCCGTCCAG 6120 GTACCTAGCG CGTTCCGGCGCGGAGGTTTA AAGACCCCTT GGGGGGATCG CCCGTCCGCC 6180 CGTGGGTCGG GGGCGGTGGTGGGCCCGCGG GGGAGTCCCG TCGGGAGGGG CCCGGCCCCT 6240 CCCGCGCCTC CACCGCGGACTCCGCTCCCC GGCCGGGGCC GCGCCGCCGC CGCCGCCGCG 6300 GCGGCCGTCG GGTGGGGGCTTTACCCGGCG GCCGTCGCGC GCCTGCCGCG CGTGTGGCGT 6360 GCGCCCCGCG CCGTGGGGGCGGGAACCCCC GGGCGCCTGT GGGGTGGTGT CCGCGCTCGC 6420 CCCCGCGTGG GCGGCGCGCGCCTCCCCGTG GTGTGAAACC TTCCGACCCC TCTCCGGAGT 6480 CCGGTCCCGT TTGCTGTCTCGTCTGGCCGG CCTGAGGCAA CCCCCTCTCC TCTTGGGCGG 6540 GGGGGGCGGG GGGACGTGCCGCGCCAGGAA GGGCCTCCTC CCGGTGCGTC GTCGGGAGCG 6600 CCCTCGCCAA ATCGACCTCGTACGACTCTT AGCGGTGGAT CACTCGGCTC GTGCGTCGAT 6660 GAAGAACGCA GCTAGCTGCGAGAATTAATG TGAATTGCAG GACACATTGA TCATCGACAC 6720 TTCGAACGCA CTTGCGGCCCCGGGTTCCTC CCGGGGCTAC GCCTGTCTGA GCGTCGCTTG 6780 CCGATCAATC GCCCCGGGGGTGCCTCCGGG CTCCTCGGGG TGCGCGGCTG GGGGTTCCCT 6840 CGCAGGGCCC GCCGGGGGCCCTCCGTCCCC CTAAGCGCAG ACCCGGCGGC GTCCGCCCTC 6900 CTCTTGCCGC CGCGCCCGCCCCTTCCCCCT CCCCCCGCGG GCCCTGCGTG GTCACGCGTC 6960 GGGTGGCGGG GGGGAGAGGGGGGCGCGCCC GGCTGAGAGA GACGGGGAGG GCGGCGCCGC 7020 CGCCGGAAGA CGGAGAGGGAAAGAGAGAGC CGGCTCGGGC CGAGTTCCCG TGGCCGCCGC 7080 CTGCGGTCCG GGTTCCTCCCTCGGGGGGCT CCCTCGCGCC GCGCGCGGCT CGGGGTTCGG 7140 GGTTCGTCGG CCCCGGCCGGGTGGAAGGTC CCGTGCCCGT CGTCGTCGTC GTCGCGCGTC 7200 GTCGGCGGTG GGGGCGTGTTGCGTGCGGTG TGGTGGTGGG 
GGAGGAGGAA GGCGGGTCCG 7260 GAAGGGGAAG GGTGCCGGCGGGGAGAGAGG GTCGGGGGAG CGCGTCCCGG TCGCCGCGGT 7320 TCCGCCGCCC GCCCCCGGTGGCGGCCCGGC GTCCGGCCGA CCGGCCGCTC CCCGCGCCCC 7380 TCCTCCTCCC CGCCGCCCCTCCTCCGAGGC CCCGCCCGTC CTCCTCGCCC TCCCCGCGCG 7440 TACGCGCGCG CGCCCGCCCGCCCGGCTCGC CTCGCGGCGC GTCGGCCGGG GCCGGGAGCC 7500 CGCCCCGCCG CCCGCCCGTGGCCGCGGCGC CGGGGTTCGC GTGTCCCCGG CGGCGACCCG 7560 CGGGACGCCG CGGTGTCGTCCGCCGTCGCG CGCCCGCCTC CGGCTCGCGG CCGCGCCGCG 7620 CCGCGCCGGG GCCCCGTCCCGAGCTTCCGC GTCGGGGCGG CGCGGCTCCG CCGCCGCGTC 7680 CTCGGACCCG TCCCCCCGACCTCCGCGGGG GAGACGCGCC GGGGCGTGCG GCGCCCGTCC 7740 CGCCCCCGGC CCGTGCCCCTCCCTCCGGTC GTCCCGCTCC GGCGGGGCGG CGCGGGGGCG 7800 CCGTCGGCCG CGCGCTCTCTCTCCCGTCGC CTCTCCCCCT CGCCGGGCCC GTCTCCCGAC 7860 GGAGCGTCGG GCGGGCGGTCGGGCCGGCGC GATTCCGTCC GTCCGTCCGC CGAGCGGCCC 7920 GTCCCCCTCC GAGACGCGACCTCAGATCAG ACGTGGCGAC CCGCTGAATT TAAGCATATT 7980 AGTCAGCGGA GGAAAAGAAACTAACCAGGA TTCCCTCAGT AACGGCGAGT GAACAGGGAA 8040 GAGCCCAGCG CCGAATCCCCGCCCCGCGGG GCGCGGGACA TGTGGCGTAC GGAAGACCCG 8100 CTCCCCGGCG CCGCTCGTGGGGGGCCCAAG TCCTTCTGAT CGAGGCCCAG CCCGTGGACG 8160 GTGTGAGGCC GGTAGCGGCCGGCGCGCGCC CGGGTCTTCC CGGAGTCGGG TTGCTTGGGA 8220 ATGCAGCCCA AAGCGGGTGGTAAACTCCAT CTAAGGCTAA ATACCGGCAC GAGACCGATA 8280 GTCAACAAGT ACCGTAAGGGAAAGTTGAAA AGAACTTTGA AGAGAGAGTT CAAGAGGGCG 8340 TGAAACCGTT AAGAGGTAAACGGGTGGGGT CCGCGCAGTC CGCCCGGAGG ATTCAACCCG 8400 GCGGCGGGTC CGGCCGTGTCGGCGGCCCGG CGGATCTTTC CCGCCCCCCG TTCCTCCCGA 8460 CCCCTCCACC CGCCCTCCCTTCCCCCGCCG CCCCTCCTCC TCCTCCCCGG AGGGGGCGGG 8520 CTCCGGCGGG TGCGGGGGTGGGCGGGCGGG GCCGGGGGTG GGGTCGGCGG GGGACCGTCC 8580 CCCGACCGGC GACCGGCCGCCGCCGGGCGC ATTTCCACCG CGGCGGTGCG CCGCGACCGG 8640 CTCCGGGACG GCTGGGAAGGCCCGGCGGGG AAGGTGGCTC GGGGGGCCCC GTCCGTCCGT 8700 CCGTCCTCCT CCTCCCCCGTCTCCGCCCCC CGGCCCCGCG TCCTCCCTCG GGAGGGCGCG 8760 CGGGTCGGGG CGGCGGCGGCGGCGGCGGTG GCGGCGGCGG CGGGGGCGGC GGGACCGAAA 8820 CCCCCCCCGA GTGTTACAGCCCCCCCGGCA GCAGCACTCG CCGAATCCCG GGGCCGAGGG 8880 AGCGAGACCC GTCGCCGCGCTCTCCCCCCT CCCGGCGCCC ACCCCCGCGG GGAATCCCCC 8940 GCGAGGGGGG 
TCTCCCCCGCGGGGGCGCGC CGGCGTCTCC TCGTGGGGGG GCCGGGCCAC 9000 CCCTCCCACG GCGCGACCGCTCTCCCACCC CTCCTCCCCG CGCCCCCGCC CCGGCGACGG 9060 GGGGGGTGCC GCGCGCGGGTCGGGGGGCGG GGCGGACTGT CCCCAGTGCG CCCCGGGCGG 9120 GTCGCGCCGT CGGGCCCGGGGGAGGTTCTC TCGGGGCCAC GCGCGCGTCC CCCGAAGACG 9180 GGGACGGCGG AGCGAGCGCACGGGGTCGGC GGCGACGTCG GCTACCCACC CGACCCGTCT 9240 TGAAACACGG ACCAAGGAGTCTAACACGTG CGCGAGTCGG GGGCTCGCAC GAAAGCCGCC 9300 GTGGCGCAAT GAAGGTGAAGGCCGGCGCGC TCGCCGGCCG AGGTGGGATC CCGAGGCCTC 9360 TCCAGTCCGC CGAGGGCGCACCACCGGCCC GTCTCGCCCG CCGCGCCGGG GAGGTGGAGC 9420 ACGAGCGCAC GTGTTAGGACCCGAAAGATG GTGAACTATG CCTGGGCAGG GCGAAGCCAG 9480 AGGAAACTCT GGTGGAGGTCCGTAGCGGTC CTGACGTGCA AATCGGTCGT CCGACCTGGG 9540 TATAGGGGCG AAAGACTAATCGAACCATCT AGTAGCTGGT TCCCTCCGAA GTTTCCCTCA 9600 GGATAGCTGG CGCTCTCGCAGACCCGACGC ACCCCCGCCA CGCAGTTTTA TCCGGTAAAG 9660 CGAATGATTA GAGGTCTTGGGGCCGAAACG ATCTCAACCT ATTCTCAAAC TTTAAATGGG 9720 TAAGAAGCCC GGCTCGCTGGCGTGGAGCCG GGCGTGGAAT GCGAGTGCCT AGTGGGCCAC 9780 TTTTGGTAAG CAGAACTGGCGCTGCGGGAT GAACCGAACG CCGGGTTAAG GCGCCCGATG 9840 CCGACGCTCA TCAGACCCCAGAAAAGGTGT TGGTTGATAT AGACAGCAGG ACGGTGGCCA 9900 TGGAAGTCGG AATCCGCTAAGGAGTGTGTA ACAACTCACC TGCCGAATCA ACTAGCCCTG 9960 AAAATGGATG GCGCTGGAGCGTCGGGCCCA TACCCGGCCG TCGCCGGCAG TCGAGAGTGG 10020 ACGGGAGCGG CGGGGGCGGCGCGCGCGCGC GCGCGTGTGG TGTGCGTCGG AGGGCGGCGG 10080 CGGCGGCGGC GGCGGGGGTGTGGGGTCCTT CCCCCGCCCC CCCCCCCACG CCTCCTCCCC 10140 TCCTCCCGCC CACGCCCCGCTCCCCGCCCC CGGAGCCCCG CGGACGCTAC GCCGCGACGA 10200 GTAGGAGGGC CGCTGCGGTGAGCCTTGAAG CCTAGGGCGC GGGCCCGGGT GGAGCCGCCG 10260 CAGGTGCAGA TCTTGGTGGTAGTAGCAAAT ATTCAAACGA GAACTTTGAA GGCCGAAGTG 10320 GAGAAGGGTT CCATGTGAACAGCAGTTGAA CATGGGTCAG TCGGTCCTGA GAGATGGGCG 10380 AGCGCCGTTC CGAAGGGACGGGCGATGGCC TCCGTTGCCC TCGGCCGATC GAAAGGGAGT 10440 CGGGTTCAGA TCCCCGAATCCGGAGTGGCG GAGATGGGCG CCGCGAGGCG TCCAGTGCGG 10500 TAACGCGACC GATCCCGGAGAAGCCGGCGG GAGCCCCGGG GAGAGTTCTC TTTTCTTTGT 10560 GAAGGGCAGG GCGCCCTGGAATGGGTTCGC CCCGAGAGAG GGGCCCGTGC CTTGGAAAGC 10620 GTCGCGGTTC CGGCGGCGTCCGGTGAGCTC 
TCGCTGGCCC TTGAAAATCC GGGGGAGAGG 10680 GTGTAAATCT CGCGCCGGGCCGTACCCATA TCCGCAGCAG GTCTCCAAGG TGAACAGCCT 10740 CTGGCATGTT GGAACAATGTAGGTAAGGGA AGTCGGCAAG CCGGATCCGT AACTTCGGGA 10800 TAAGGATTGG CTCTAAGGGCTGGGTCGGTC GGGCTGGGGC GCGAAGCGGG GCTGGGCGCG 10860 CGCCGCGGCT GGACGAGGCGCGCGCCCCCC CCACGCCCGG GGCACCCCCC TCGCGGCCCT 10920 CCCCCGCCCC ACCCGCGCGCGCCGCTCGCT CCCTCCCCAC CCCGCGCCCT CTCTCTCTCT 10980 CTCTCCCCCG CTCCCCGTCCTCCCCCCTCC CCGGGGGAGC GCCGCGTGGG GGCGCGGCGG 11040 GGGGAGAAGG GTCGGGGCGGCAGGGGCCGC GCGGCGGCCG CCGGGGCGGC CGGCGGGGGC 11100 AGGTCCCCGC GAGGGGGGCCCCGGGGACCC GGGGGGCCGG CGGCGGCGCG GACTCTGGAC 11160 GCGAGCCGGG CCCTTCCCGTGGATCGCCCC AGCTGCGGCG GGCGTCGCGG CCGCCCCCGG 11220 GGAGCCCGGC GGCGGCGCGGCGCGCCCCCC ACCCCCACCC CACGTCTCGG TCGCGCGCGC 11280 GTCCGCTGGG GGCGGGAGCGGTCGGGCGGC GGCGGTCGGC GGGCGGCGGG GCGGGGCGGT 11340 TCGTCCCCCC GCCCTACCCCCCCGGCCCCG TCCGCCCCCC GTTCCCCCCT CCTCCTCGGC 11400 GCGCGGCGGC GGCGGCGGCAGGCGGCGGAG GGGCCGCGGG CCGGTCCCCC CCGCCGGGTC 11460 CGCCCCCGGG GCCGCGGTTCCGCGCGCGCC TCGCCTCGGC CGGCGCCTAG CAGCCGACTT 11520 AGAACTGGTG CGGACCAGGGGAATCCGACT GTTTAATTAA AACAAAGCAT CGCGAAGGCC 11580 CGCGGCGGGT GTTGACGCGATGTGATTTCT GCCCAGTGCT CTGAATGTCA AAGTGAAGAA 11640 ATTCAATGAA GCGCGGGTAAACGGCGGGAG TAACTATGAC TCTCTTAAGG TAGCCAAATG 11700 CCTCGTCATC TAATTAGTGACGCGCATGAA TGGATGAACG AGATTCCCAC TGTCCCTACC 11760 TACTATCCAG CGAAACCACAGCCAAGGGAA CGGGCTTGGC GGAATCAGCG GGGAAAGAAG 11820 ACCCTGTTGA GCTTGACTCTAGTCTGGCAC GGTGAAGAGA CATGAGAGGT GTAGAATAAG 11880 TGGGAGGCCC CCGGCGCCCCCCCGGTGTCC CCGCGAGGGG CCCGGGGCGG GGTCCGCGGC 11940 CCTGCGGGCC GCCGGTGAAATACCACTACT CTGATCGTTT TTTCACTGAC CCGGTGAGGC 12000 GGGGGGGCGA GCCCGAGGGGCTCTCGCTTC TGGCGCCAAG CGCCCGCCCG GCCGGGCGCG 12060 ACCCGCTCCG GGGACAGTGCCAGGTGGGGA GTTTGACTGG GGCGGTACAC CTGTCAAACG 12120 GTAACGCAGG TGTCCTAAGGCGAGCTCAGG GAGGACAGAA ACCTCCCGTG GAGCAGAAGG 12180 GCAAAAGCTC GCTTGATCTTGATTTTCAGT ACGAATACAG ACCGTGAAAG CGGGGCCTGA 12240 CGATCCTTCT GACCTTTTGGGTTTTAAGCA GGAGGTGTCA GAAAAGTTAC CACAGGGATA 12300 ACTGGCTTGT GGCGGCCAAGCGTTCATAGC GACGTCGCTT 
TTTGATCCTT CGATGTCGGC 12360 TCTTCCTATC ATTGTGAAGCAGAATTCGCC AAGCGTTGGA TTGTTCACCC ACTAATAGGG 12420 AACGTGAGCT GGGTTTAGACCGTCGTGAGA CAGGTTAGTT TTACCCTACT GATGATGTGT 12480 TGTTGCCATG GTAATCCTGCTCAGTACGAG AGGAACCGCA GGTTCAGACA TTTGGTGTAT 12540 GTGCTTGGCT GAGGAGCCAATGGGGCGAAG CTACCATCTG TGGGATTATG ACTGAACGCC 12600 TCTAAGTCAG AATCCCGCCCAGGCGAACGA TACGGCAGCG CCGCGGAGCC TCGGTTGGCC 12660 TCGGATAGCC GGTCCCCCGCCTGTCCCCGC CGGCGGGCCG CCCCCCCCTC CACGCGCCCC 12720 GCCGCGGGAG GGCGCGTGCCCCGCCGCGCG CCGGGACCGG GGTCCGGTGC GGAGTGCCCT 12780 TCGTCCTGGG AAACGGGGCGCGGCCGGAAA GGCGGCCGCC CCCTCGCCCG TCACGCACCG 12840 CACGTTCGTG GGGAACCTGGCGCTAAACCA TTCGTAGACG ACCTGCTTCT GGGTCGGGGT 12900 TTCGTACGTA GCAGAGCAGCTCCCTCGCTG CGATCTATTG AAAGTCAGCC CTCGACACAA 12960 GGGTTTGTCC GCGCGCGCGTGCGTGCGGGG GGCCCGGCGG GCGTGCGCGT TCGGCGCCGT 13020 CCGTCCTTCC GTTCGTCTTCCTCCCTCCCG GCCTCTCCCG CCGACCGCGG CGTGGTGGTG 13080 GGGTGGGGGG GAGGGCGCGCGACCCCGGTC GGCCGCCCCG CTTCTTCGGT TCCCGCCTCC 13140 TCCCCGTTCA CGCCGGGGCGGCTCGTCCGC TCCGGGCCGG GACGGGGTCC GGGGAGCGTG 13200 GTTTGGGAGC CGCGGAGGCGCCGCGCCGAG CCGGGCCCCG TGGCCCGCCG GTCCCCGTCC 13260 CGGGGGTTGG CCGCGCGGCGCGGTGGGGGG CCACCCGGGG TCCCGGCCCT CGCGCGTCCT 13320 TCCTCCTCGC TCCTCCGCACGGGTCGACCG ACGAACCGCG GGTGGCGGGC GGCGGGCGGC 13380 GAGCCCCACG GGCGTCCCCGCACCCGGCCG ACCTCCGCTC GCGACCTCTC CTCGGTCGGG 13440 CCTCCGGGGT CGACCGCCTGCGCCCGCGGG CGTGAGACTC AGCGGCGTCT CGCCGTGTCC 13500 CGGGTCGACC GCGGCCTTCTCCACCGAGCG GCGGTGTAGG AGTGCCCGTC GGGACGAACC 13560 GCAACCGGAG CGTCCCCGTCTCGGTCGGCA CCTCCGGGGT CGACCAGCTG CCGCCCGCGA 13620 GCTCCGGACT TAGCCGGCGTCTGCACGTGT CCCGGGTCGA CCAGCAGGCG GCCGCCGGAC 13680 GCAGCGGCGC ACGCACGCGAGGGCGTCGAT TCCCCTTCGC GCGCCCGCGC CTCCACCGGC 13740 CTCGGCCCGC GGTGGAGCTGGGACCACGCG GAACTCCCTC TCCCACATTT TTTTCAGCCC 13800 CACCGCGAGT TTGCGTCCGCGGGACCTTTA AGAGGGAGTC ACTGCTGCCG TCAGCCAGTA 13860 CTGCCTCCTC CTTTTTCGCTTTTAGGTTTT GCTTGCCTTT TTTTTTTTTT TTTTTTTTTT 13920 TTTTTTCTTT CTTTCTTTCTTTCTTTCTTT CTTTCTTTCT TTCTTTCTTT CGCTTGTCTT 13980 CTTCTTGTGT TCTCTTCTTGCTCTTCCTCT GTCTGTCTCT CTCTCTCTCT 
CTCTCTCTGT 14040 CTCTCGCTCT CGCCCTCTCTCTCTTCTCTC TCTCTCTCTC TCTCTCTCTG TCTCTCGCTC 14100 TCGCCCTCTC TCTCTCTCTTCTCTCTGTCT CTCTCTCTCT CTCTCTCTCT CTCTCTCTCC 14160 GTCGCTCTCG CCCTCTCGCTCTCTCTCTGT CTCTGTCTGT GTCTCTCTCT CTCCCTCCCT 14220 CCCTCCCTCC CTCCCTCCCTCCCTCCCCTT CCTTGGCGCC TTCTCGGCTC TTGAGACTTA 14280 GCCGCTGTCT CGCCGTACCCCGGGTCGACC GGCGGGCCTT CTCCACCGAG CGGCGTGCCA 14340 CAGTGCCCGT CGGGACGAGCCGGACCCGCC GCGTCCCCGT CTCGGTCGGC ACCTCCGGGG 14400 TCGACCAGCT GCCGCCCGCGAGCTCCGGAC TTAGCCGGCG TCTGCACGTG TCCCGGTCG 14460 ACCAGCAGGC GGCCGCCGGACGCAGCGGCG CACCGACGGA GGGCGCTGAT TCCCGTTCAC 14520 GCGCCCGCGC CTCCACCGGCCTCGGCCCGC CGTGGAGCTG GGACCACGCG GAACTCCCTC 14580 TCCTACATTT TTTTCAGCCCCACCGCGAGT TTGCGTCCGC GGGACCTTTA AGAGGGAGTC 14640 ACTGCTGCCG TCAGCCAGTACTGCCTCCTC CTTTTTCGCT TTTAGGTTTT GCTTGCCTTT 14700 TTTTTTTTTT TTTTTTTTTTTTTTTTCTTT CTTTCTTTCT TTCTTTCTTT CTTTCTTTCT 14760 TTCTTTCTTT CTTTCGCTCTCGCTCTCTCG CTCTCTCCCT CGCTCGTTTC TTTCTTTCTC 14820 TTTCTCTCTC TCTCTCTCTCTCTCTCTCTC TCTGTCTCTC GCTCTCGCCC TCTCTCTCTC 14880 TTTCTCTCTC TCTCTGTCTCTCTCTCTCTC TCTCTCTCTC TCTCTCTCTC CCTCCCTCCC 14940 TCCCCCTCCC TCCCTCTCTCCCCTTCCTTG GCGCCTTCTC GGCTCTTGAG ACTTAGCCGC 15000 TGTCTCGCCG TGTCCCGGGTCGACCGGCGG GCCTTCTCCA CCGAGCGGCG TGCCACAGTG 15060 CCCGTCGGGA CGAGCCGGACCCGCCGCGTC CCCGTCTCGG TCGGCACCTC CGGGGTCGAC 15120 CAGCTGCCGC CCGCGAGCTCCGGACTTAGC CGGCGTCTGC ACGTGTCCCG GGTCGACCAG 15180 CAGGCGGCCG CCGGACGCTGCGGCGCACCG ACGCGAGGGC GTCGATTCCG GTTCACGCGC 15240 CGGCGACCTC CACCGGCCTCGGCCCGCGGT GGAGCTGGGA CCACGCGGAA CTCCCTCTCC 15300 CACATTTTTT TCAGCCCCACCGCGAGTTTG CGTCCGCGGG ACTTTTAAGA GGGAGTCACT 15360 GCTGCCGTCA GCCAGTAATGCTTCCTCCTT TTTTGCTTTT TGGTTTTGCC TTGCGTTTTC 15420 TTTCTTTCTT TCTTTCTTTCTTTCTTTCTT TCTTTCTTTC TCTCTCTCTC TCTCTCTCTC 15480 TCTCTGTCTC TCTCTCTCTGTCTCTCTCCC CTCCCTCCCT CCTTGGTGCC TTCTCGGCTC 15540 GCTGCTGCTG CTGCCTCTGCCTCCACGGTT CAAGCAAACA GCAAGTTTTC TATTTCGAGT 15600 AAAGACGTAA TTTCACCATTTTGGCCGGGC TGGTCTCGAA CTCCCGACCT AGTGATGAGT 15660 CCGCCTCGGC CTCCCAAAGACTGCTGGGAG TACAGATGTG AGCCACCATG CCCGGCCGAT 
15720 TCCTTCCTTT TTTCAATCTTATTTTCTGAA CGCTGCCGTG TATGAACATA CATCTACACA 15780 CACACACACA CACACACACACACACACACA CACACACACA CACACACCCC GTAGTGATAA 15840 AACTATGTAA ATGATATTTCCATAATTAAT ACGTTTATAT TATGTTACTT TTAATGGATG 15900 AATATGTATC GAAGCCCCATTTCATTTACA TACACGTGTA TGTATATCCT TCCTCCCTTC 15960 CTTCATTCAT TATTTATTAATAATTTTCGT TTATTTATTT TCTTTTCTTT TGGGGCCGGC 16020 CCGCCTGGTC TTCTGTCTCTGCGCTCTGGT GACCTCAGCC TCCCAAATAG CTGGGACTAC 16080 AGGGATCTCT TAAGCCCGGGAGGAGAGGTT AACGTGGGCT GTGATCGCAC ACTTCCACTC 16140 CAGCTTACGT GGGCTGCGGTGCGGTGGGGT GGGGTGGGGT GGGGTGGGGT GCAGAGAAAA 16200 CGATTGATTG CGATCTCAATTGCCTTTTAG CTTCATTCAT ACCCTGTTAT TTGCTCGTTT 16260 ATTCTCATGG GTTCTTCTGTGTCATTGTCA CGTTCATCGT TTGCTTGCCT GCTTGCCTGT 16320 TTATTTCCTT CCTTCCTTCCTTCCTTCCTT CCTTCCTTCC TTCCTTCCTT CCCTCCCTTA 16380 CTGGCAGGGT CTTCCTCTGTCTCTGCCGCC CAGGATCACC CCAACCTCAA CGCTTTGGAC 16440 CGACCAAACG GTCGTTCTGCCTCTGATCCC TCCCATCCCC ATTACCTGAG ACTACAGGCG 16500 CGCACCACCA CACCGGCTGACTTTTATGTT GTTTCTCATG TTTTCCGTAG GTAGGTATGT 16560 GTGTGTGTGT GTGTGTGTGTGTGTGTGTGT GTGTGTGTGT GTGTGTGTGT GTGTGTACTC 16620 ATGTATGTAC GTATGTATGTATGTATGTGA GTGAGATGGG TTTCGGGGTT CTATCATGTT 16680 GCCCACGCTG GTCTCGAACTCCTGTCCTCA AGCAATCCGC CTGCCTGCCT CGGCCGCCCA 16740 CACTGCTGCT ATTACAGGCGTGAGACGCTG CGCCTGGCTC CTTCTACATT TGCCTGCCTG 16800 CCTGCCTGCC TGCCTGCCTATCAATCGTCT TCTTTTTAGT ACGGATGTCG TCTCGCTTTA 16860 TTGTCCATGC TCTGGGCACACGTGGTCTCT TTTCAAACTT CTATGATTAT TATTATTGTA 16920 GGCGTCATCT CACGTGTCGAGGTGATCTCG AACTTTTAGG CTCCAGAGAT CCTCCCGCAT 16980 CGGCCTCCCG GAGTGCTGTGATGACACGCG TGGGCACGGT ACGCTCTGGT CGTGTTTGTC 17040 GTGGGTCGGT TCTTTCCGTTTTTAATACGG GGACTGCGAA CGAAGAAAAT TTTCAGACGC 17100 ATCTCACCGA TCCGCCTTTTCGTTCTTTCT TTTTATTCTC TTTAGACGGA GTTTCACTCT 17160 TGTCGCCCAG GGTGGAGTACGATGGCGGCT CTCGGCTCAC CGCACCCTCC GCCTCCCAGG 17220 TTCAAGTGAT TCTCCTGCCTCAGCCTTCCC GAGTAGCTGG AATGACAGAG ATGAGCCATC 17280 GTGCCCGGCT AATTTTTCTATTTTTAGTAC AGATGGGGTT TCTCCATCTT GGTCAGGCTC 17340 GTCTTCAACT TCCGACCGTTGGAGAATCTT AACTTTCTTG GTGGTGGTTG TTTTCCTTTT 17400 
TCTTTTTTTT TCTTTTCTTTTCTTTCCTTC TCCTCCCCCC CCCACCCCCC TTGTCGTCGT 17460 CCTCCTCCTC CTCCTCCTCCTCCTCCTCCT CCTCCTCCTC CTCCTCCTCC TCTTTCATTT 17520 CTTTCAGCTG GGCTCTCCTACTTGTGTTGC TCTGTTGCTC ACGCTGGTCT CAAACTCCTG 17580 GCCTTGACTC TTCTCCCGTCACATCCGCCG TCTGGTTGTT GAAATGAGCA TCTCTCGTAA 17640 AATGGAAAAG ATGAAAGAAATAAACACGAA GACGGAAAGC ACGGTGTGAA CGTTTCTCTT 17700 GCCGTCTCCC GGGGTGTACCTTGGACCCGG AAACACGGAG GGAGCTTGGC TGAGTGGGTT 17760 TTCGGTGCCG AAACCTCCCGAGGGCCTCCT TCCCTCTCCC CCTTGTCCCC GCTTCTCCGC 17820 CAGCCGAGGC TCCCACCGCCGCCCCTGGCA TTTTCCATAG GAGAGGTATG GGAGAGGAGT 17880 GACACGCCTT CCAGATCTATATCCTGCCGG ACGTCTCTGG CTCGGCGTGC CCCACCGGCT 17940 ACCTGCCACC TTCCAGGGAGCTCTGAGGCG GATGCGACCC CCACCCCCCC GTCACGTCCC 18000 GCTACCCTCC CCCGGCTGGCCTTTGCCGGG CGACCCCAGG GGAACCGCGT TGATGCTGCT 18060 TCGGATCCTC CGGCGAAGACTTCCACCGGA TGCCCCGGGT GGGCCGGTTG GGATCAGACT 18120 GGACCACCCC GGACCGTGCTGTTCTTGGGG GTGGGTTGAC GTACAGGGTG GACTGGCAGC 18180 CCCAGCATTG TAAAGGGTGCGTGGGTATGG AAATGTCACC TAGGATGCCC TCCTTCCCTT 18240 CGGTCTGCCT TCAGCTGCCTCAGGCGTGAA GACAACTTCC CATCGGAACC TCTTCTCTTC 18300 CCTTTCTCCA GCACACAGATGAGACGCACG AGAGGGAGAA ACAGCTCAAT AGATACCGCT 18360 GACCTTCATT TGTGGAATCCTCAGTCATCG ACACACAAGA CAGGTGACTA GGCAGGGACA 18420 CAGATCAAAC ACTATTTCCGGGTCCTCGTG GTGGGATTGG TCTCTCTCTC TCTCTCTCTC 18480 TCTCTCTCTC TCTCTCTCTCTCTCGCACGC GCACGCGCGC ACACACACAC ACAATTTCCA 18540 TATCTAGTTC ACAGAGCACACTCACTTCCC CTTTTCACAG TACGCAGGCT GAGTAAAACG 18600 CGCCCCACCC TCCACCCGTTGGCTGACGAA ACCCCTTCTC TACAATTGAT GAAAAAGATG 18660 ATCTGGGCCG GGCACGCTAGCTCACGCCTG TCACTCCGGC ACTTTGGGAG GCCGAGGCGG 18720 GTGGATCGCT TGGGGCCGGGAGTTCGAGAC CAGGCTGGCC GACGTGGCGA AACCCCGTCT 18780 CTCTGAAAAA TAGAACGATTAGCCGGGCCT GGTGGCGTGG GCTTGGAATC ACGACCGCTC 18840 GGGAGACTGG GGCGGGCGACTTGTTCCAAC CGGGGAGGCC GAGGCCGCGA TGAGCTGAGA 18900 TCGTGCCGTG GCGATGCGGCCTGGATGACG GAGCGAGACC CCGTCTCGAG AGAATCATGA 18960 TGTTATTATA AGATGAGTTGTGCGCGGTGA TGGCCGCCTG TAGTCGCGGC TACTCGGGAG 19020 GCTGAGACGA GGAGAAGATCACTTGAGGCC CCACAGGTCG AGGCTTCGGT CGGCCGTGAC 19080 CCACTGTATC 
CTGGGCAGTCACCGGTCAAG GAGATATGCC CCTTCCCCGT TTGCTTTTCT 19140 TTTCTTCCCT TCTCTTTTCTTCTTTTTGCT TCTCTTTTCT TTCTTTCTTT CTTTCTTTCT 19200 TTCTTTCTTT CTTTCTTTCTTTTTCTTTTT CTCTCTTCCC CTCTTTCTTT CCTGCCTTCC 19260 TGCCTTTCTT CTTTTCTTCTTTCCTCCCTT CCTCCCTTCC TTCTTTCCTC CCGCCTCAGC 19320 CTCCCAAAGT GCTGGGATGACTGGCGGGAG GCACCATGCC TGCTTGGCCC AAAGAGACCC 19380 TCTTGGAAAG TGAGACGCAGAGAGCGCCTT CCAGTGATCT CATTGACTGA TTTAGAGACG 19440 GCATCTCGCT CCGTCACCCCGGCAGTGGTG CCGTCGTAAC TCACTCCCTG CAGCGTGGAC 19500 GCTCCTGGAC TCGAGCGATCCTTCCACCTC AGCCTCCAGA GTACAGAGCC TGGGACCGCG 19560 GGCACGCGCC ACTGTGCCCACACCGTTTTT AATTGTTTTT TTTTCCCCCG AGACAGAGTT 19620 TCACTCTCGT GGCCTAGACTGCAGTGCGGT GGCGCGATCT TGGCTCACCG CAACCTCTGC 19680 CTCCCGGTTT CAAGCGATTCTCCTGCATCG GCCTCCTGAG TAGCCGGGAT TGCGGGCATG 19740 CGCTGCCACG TCTGGCTGATTTCGTATTTT TAGTGGAGAC GGGGCTTCTC CATGTCGATC 19800 GGGCTGGTTT CGAACTCCCGACCTCAGGTG ATCCGCCCTC CCCGGCCTCC GGAAGTGCTG 19860 GGATGACAGG CGTGAGCCACCGCGCCCGGC CTTCATTTTT AAATGTTTTC CCACAGACGG 19920 GGTCTCATCA TTTCTTTGCAACCCTCCTGC CCGGCGTCTC AAAGTGCTGG CGTGACGGGC 19980 GTGAGCCACT GCGCCTGGACTCCGGGGAAT GACTCACGAC CACCATCGCT CTACTGATCC 20040 TTTCTTTCTT TCTTTCTTTCTTTCTTTCTT TCTTTCTTTC TTTCTTTCTT TCTTTCTTGA 20100 TGAATTATCT TATGATTTATTTGTGTACTT ATTTTCAGAC GGAGTCTCGC TCTGGGCGGG 20160 GCGAGGCGAG GCGAGGCACAGCGCATCGCT TTGGAAGCCG CGGCAACGCC TTTCAAAGCC 20220 CCATTCGTAT GCACAGAGCCTTATTCCCTT CCTGGAGTTG GAGCTGATGC CTTCCGTAGC 20280 CTTGGGCTTC TCTCCATTCGGAAGCTTGAC AGGCGCAGGG CCACCCAGAG GCTGGCTGCG 20340 GCTGAGGATT AGGGGGTGTGTTGGGGCTGA AAACTGGGTC CCCTATTTTT GATACCTCAG 20400 CCGACACATC CCCCGACCGCCATCGCTTGC TCGCCCTCTG AGATCCCCCG CCTCCACCGC 20460 CTTGCAGGCT CACCTCTTACTTTCATTTCT TCCTTTCTTG CGTTTGAGGA GGGGGTGCGG 20520 GAATGAGGGT GTGTGTGGGGAGGGGGTGCG GGGTGGGGAC GGAGGGGAGC GTCCTAAGGG 20580 TCGATTTAGT GTCATGCCTCTTTCACCACC ACCACCACCA CCGAAGATGA CAGCAAGGAT 20640 CGGCTAAATA CCGCGTGTTCTCATCTAGAA GTGGGAACTT ACAGATGACA GTTCTTGCAT 20700 GGGCAGAACG AGGGGGACCGGGGACGCGGA AGTCTGCTTG AGGGAGGAGG GGTGGAAGGA 20760 GAGACAGCTT 
CAGGAAGAAAACAAAACACG AATACTGTCG GACACAGCAC TGACTACCCG 20820 GGTGATGAAA TCATCTGCACACTGAACACC CCCGTCACAA GTTTACCTAT GTCACAATCT 20880 TGCACATGTA TCGCTTGAACGACAAATAAA AGTTAGGGGG GAGAAGAGAG GAGAGAGAGA 20940 GAGAGAGAGA GACAGAGAGAGACAGAGAGA GAGAGAGAGG AGGGAGAGAG GAAAACGAAA 21000 CACCACCTCC TTGACCTGAGTCAGGGGGTT TCTGGCCTTT TGGGAGAACG TTCAGCGACA 21060 ATGCAGTATT TGGGCCCGTTCTTTTTTTTT CTTCTTCTTT TCTTTCTTTT TTTTTGGACT 21120 GAGTCTCTCT CGCTCTGTCACCCAGGCTGC GGTCGCGGTG GCGCTCTCTC GGCTCACTGA 21180 AACCTCTGCT TCCCGGGTTCCAGTGATTCT TCTTCGGTAG CTGGGATTAC AGGCGCACAC 21240 CATGACGGCG GGCTCATATTCCTATTTTCA GTAGAGACGG GGTTTCTCCA CGTTGGCCAC 21300 GCTGGTCTCG AACTCCTGACCTCAAATGAT CCGCCTTCCT GGGCCTCCCA AAGTGCTGGA 21360 AACGACAGGC CTGAGCCGCCGGGATTTCAG CCTTTAAAAG CGCGGCCCTG CCACCTTTCG 21420 CTGTGGCCCT TACGCTCAGAATGACGTGTC CTCTCTGCCG TAGGTTGACT CCTTGAGTCC 21480 CCTAGGCCAT TGCACTGTAGCCTGGGCAGC AAGAGCCAAA CTCCGNNCCC CCACCTCCTC 21540 GCGCACATAA TAACTAACTAACAAACTAAC TAACTAACTA AACTAACTAA CTAACTAAAA 21600 TCTCTACACG TCACCCATAAGTGTGTGTTC CCGTGAGAGT GATTTCTAAG AAATGGTACT 21660 GTACACTGAA CGCAGTGGCTCACGTCTGTC ATCCCGAGGT CAGGAGTTCG AGACCAGCCC 21720 GGCCAACGTG GTGAAACCCCGTCTCTACTG AAAATACGAA ATGGAGTCAG GCGCCGTGGG 21780 GCAGGCACCT GTAACCCCAGCTACTCGGGA GGCTGGGGTG GAAGAATTGC TTGAACCTGG 21840 CAGGCGGAGG CTGCAGTGACCCAAGATCGC ACCACTGCAC TACAGCCTGG GCGACAGAGT 21900 GAGACCCGGT CTCCAGATAAATACGTACAT AAATAAATAC ACACATACAT ACATATATAC 21960 ATACATACAT ACATACATACATCCATGCAT ACAGATATAC AAGAAAGAAA AAAAGAAAAG 22020 AAAAGAAAGA GAAAATGAAAGAAAAGGCAC TGTATTGCTA CTGGGCTAGG GCCTTCTCTC 22080 TGTCTGTTTC TCTCTGTTCGTCTCTGTCTT TCTCTCTGTG TCTCTTTCTC TGTCTGTCTG 22140 TCTCTTTCTT TCTCTCTGTCTCTGTCTCTG TCTTTGTCTC TCTCTCTCCC TCTCTGCCTG 22200 TCTCACTGTG TCTGTCTTCTGTCTTACTCT CTTTCTCTCC CCGTCTGTCT CTCTCTCTCT 22260 CTCTCCCTCC CTGTTTGTTTCTCTCTCTCC CTCCCTGTCT GTTTCTCTCT CTCTCTTTCT 22320 GTCTGTTTCT GTCTCTCTCTGTCTGTCTAT GTCTTTCTCT GTCTGTCTCT TTCTCTGTCT 22380 GTCTGCCTCT CTCTTTCTTTTTCTGTGTCT CTCTGTCGGT CTCTCTCTCT CTGTCTGTCT 22440 GTCTGTCTCT 
CTCTCTCTCTCTCTGTGCCT ATCTTCTGTC TTACTCTCTT TCTCTCCTGG 22500 TCTGTCTGTC TCTCCCTCCCTTTCTGTTTC TCTCTCTCTC TCTCTCTCTC TCCCCCTCTC 22560 CCTGTCTGTT TCTCTCCGTCTCTCTCTCTT TCTGTCTGTT TCTCACTGTC TCTCTCTGTC 22620 CATCTCTCTC TCTCTCTGTCTGTCTCTTTC GTTCTCTCTG TCTGTCTGTC TCTCTCTCTC 22680 TCTCTCTCTC TCTCTCTCTCTCCCTGTCTG TCTGTTTCTC TCTATCTCTC GCTGTCCATC 22740 TCTGTCTTTC TATGTCTGTCTCTTTCTCTG TCAGTCTGTC AGACACCCCC GTGCCGGGTA 22800 GGGCCCTGCC CCTTCCACGAAAGTGAGAAG CGCGTGCTTC GGTGCTTAGA GAGGCCGAGA 22860 GGAATCTAGA CAGGCGGGCCTTGCTGGGCT TCCCCACTCG GTGTATGATT TCGGGAGGTC 22920 GAGGCCGGGT CCCCGCTTGGATGCGAGGGG CATTTTCAGA CTTTTCTCTC GGTCACGTGT 22980 GGCGTCCGTA CTTCTCCTATTTCCCCGATA AGCTCCTCGA CTTCAACATA AACGGCGTCC 23040 TAAGGGTCGA TTTAGTGTCATGCCTCTTTC ACCGCCACCA CCGAAGATGA AAGCAAAGAT 23100 CGGCTAAATA CCGCGTGTTCTCATCTAGAA GTGGGAACTT ACAGATGACA GTTCTTGCAT 23160 GGGCAGAACG AGGGGGACCGGGNACGCGGA AGCCTGCTTG AGGGRGGAGG GGYGGAAGGA 23220 GAGACAGCTT CAGGAAGAAAACAAAACACG AATACTGTCG GACACAGCAC TGACTACCCG 23280 GGTGATGAAA TCATCTGCACACTGAACACC CCCGTCACAA GTTTACCTAT GTCACAGTCT 23340 TGCTCATGTA TGCTTGAACGACAAATAAAA GTTCGGGGGG GAGAAGAGAG GAGAGAGAGA 23400 GAGAGACGGG GAGAGAGGGGGGAGAGGGGG GGGGAGAGAG AGAGAGAGAG AGAGAGAGAG 23460 AGAGAGAGAG AGAAAGAGAAGTAAAACCAA CCACCACCTC CTTGACCTGA GTCAGGGGGT 23520 TTCTGGCCTT TTGGGAGAACGTTCAGCGAC AATGCAGTAT TTGGGCCCGT TCTTTTTTTC 23580 TTCTTCTTCT TTTCTTTCTTTTTTTTTGGA CTGAGTCTCT CTCGCTCTGT CACCCAGGCT 23640 GCGGTGCGGT GGCGCTCTCTCGGCTCACTG AAACCTCTGC TTCCCGGGTT CCAGTGATTC 23700 TTCTTCGGTA GCTGGGATTACAGGTGCGCA CCATGACGGC CGGCTCATCG TTCTATTTTT 23760 AGTAGAGACG GGGTTTCTCCACGTTGGCCA CGCTGGTCTC GAACTCCTGA CCACAAATGA 23820 TCCACCTTCC TGGGCCTCCCAAAGTGCTGG AAACGACAGG CCTGAGCCGC CGGGATTTCA 23880 GCCTTTAAAA GCGCGCGGCCCTGCCACCTT TCGCTGCGGC CCTTACGCTC AGAATGACGT 23940 GTCCTCTCTG CCATAGGTTGACTCCTTGAG TCCCCTAGGC CATTGCACTG TAGCCTGGGC 24000 AGCAAGAGCC AAACTCCGTCCCCCCACCTC CCCGCGCACA TAATAACTAA CTAACTAACT 24060 AACTAACTAA AATCTCTACACGTCACCCAT AAGTGTGTGT TCCCGTGAGG AGTGATTTCT 24120 AAGAAATGGT 
ACTGTACACTGAACGCAGGC TTCACGTCTG TCATCCCGAG GTCAGGAGTT 24180 CGAGACCAGC CCGGCCCACGTGGTGAAACC CCCGTCTCTA CTGAAAATAC GAAATGGAGT 24240 CAGGCGCCGT GGGGCAGGCACCTGTAACCC CAGCTACTCG GGAGGCTGGG GTGGAAGAAT 24300 TGCTTGAACC TGGCAGGCGGAGGCTGCAGT GACCCAAGAT CGCACCACTG CACTACAGCC 24360 TGGGCGACAG AGTGAGACCCGGTCTCCAGA TAAATACGTA CATAAATAAA TACACACATA 24420 CATACATACA TACATACAACATACATACAT ACAGATATAC AAGAAAGAAA AAAAGAAAAG 24480 AAAAGAAAGA GAAAATGAAAGAAAAGGCAC TGTATTGCTA CTGGGCTAGG GCCTTCTCTC 24540 TGTCTGTTTC TCTCTGTTCGTCTCTGTCTT TCTCTCTGTG TCTCTTTCTC TGTCTGTCTG 24600 TCTGTCTGTC TGTCTGTCTCTTTCTTTCTT TCTGTCTCTG TCTTTGTCCC TCTCTCTCCC 24660 TCTCTGCCCT GTCTCACTGTGTCTGTCTTC TATCTTACTC TCTTTCTCTC CCCGTCTGTC 24720 TCTCTCTCAC TCCCTCCCTGTCTGTTTCTC TCTCTCTCTC TTTCTGTCTG TTTCTGTCTC 24780 TCTCTGTCTG CCTCTCTCTTTCTCTATCTG TCTCTTTCTC TGTCTGTCTG CCCCTCTCTT 24840 TCTTTTTCTG TGTCTCTCTGTCTGTCTCTC TCTCTCTCTG TGCCTATCTT CTGTCTTACT 24900 CTCTTTCTCT GCCTGTCTGTCTGTCTCTCT CTGTCTCTCC CTCCCTTTCT GCTTCTCTCT 24960 CTCTCTCTCT CTCTNNNCCCTCCCTGTCTG TTTCTCTCTG TCTCCCTCTC TTTCTGTCTG 25020 TTTCTCACTG TCTCTCTCTGTCTGTCTGTT TCATTCTCTC TGTCTCTGTC TCTGTCTCTC 25080 TCTCTCTCTG TCTCTCCCTCTCTGTGTGTA TCTTTTGTCT TACTCTCCTT CTCTGCCTGT 25140 CCGTCTGTCT GTCTGTCTCTCTCTCTCCCT GTCCCTCTCT CTTTCTGTCT GTTTCTCTCT 25200 CTCTCTCTCT CTCTCTCTCTCTGTCTCTGT CTTTCTCTGT CTGTCCCTTT CTCTGTCTGT 25260 CTGCCTCTCT CTTTCTCTTTCTGTGTCTCT CTGTCTCTCT CTCTGTGCCT ATCTTCTGTC 25320 TTACTCTCTT TCTCTGCCTGTCTATCTGTC TGTCTCTCTC TGTCTCTCTC CCTGCCTTTC 25380 TGTTTCTCTC TCTCTCCCTCTCTCGCTCTC TCTGTCTTTC TCTCTTTCTC TCTGTTTCTC 25440 TGTCTCTCTC TGTCCGTCTCTGTCTTTTTC TGTCTGTCTG TCTCTCTCTT TCTTTCTGTC 25500 GTCTGTCTCT GTCTCTGTCTCTGTCTCTCT CTCTCTCTCT CTCCTTGTCT CTCTCACTGT 25560 GTCTGTCTTC TGTCTTACTCTCCTTCTCTG CCTGTCCATC TGTCTGTCTG TCTCTCTCTC 25620 TCTCTCCCTA CCTTTCTGTTTCTCTCTCGC TAGCTCTCTC TCTCTCTGCC TGTTTCTCTC 25680 TTTCTCTCTC TGTCTTTCTCTGTCTGTCTC TTTCTCTGTC TGTCTGTCTC TTTCTCTCTG 25740 TCTCTGTCTC TGTCTCTCTCTCTCTCTCTC TCTCTCTCTC TGCCTCTCTC ACTGTGTCTG 25800 TCTTCTGTCT 
TATTCTCTTTCTCTCTCTGT CTCTCTCTCT CTCTCCTTTA CTGTCTGTTT 25860 CTCTCTCTCT CTCTCTCTTTCTGCCTGTTT CTCTCTGTCT GTCTCTGTCT TTCTCTGTCT 25920 GTCTGCCTCT CTCTTTCTTTTTCTGCGTCT CTCTGTCTCT CTCTCTCTCT CTCTGTTCCT 25980 ATCTTCTGTC TTACTCTGTTTCCTTGCCTG CCTGCCTGTC TGTGTGTCTG TCTCTCTCTC 26040 TCTCTCTCTC TCTCTCTCCCTCCCTTTCTC TTTCTCTGTC TCTCTCTCTC TTTCTGGGTG 26100 TTTCTCTCTG TCTCTCTGTCCATCTCTGTC TTTCTATGTC TGTCTCTCTC TTTCTCTCTG 26160 TCTCTGTCTC TGCCTCTCTCTCTCTCTCTC TCTCTCTCTC TCTGTCTGTC TCTCTCACTG 26220 TGTGTGTCTG TCTTCTGTCTTACTCTCCTT CTCTGCCTGT CCGTCTGTCT GTCTGTCTCT 26280 CCCTCTCTCT CCCTCCCTTTCTGTTTCTCT CTCTCTCTCT TTCTGTCTGT TTCTCTCTTT 26340 CTCTCTCTGT CTGTCTCTTTCTCTGTCTGT CTGTCTCTCT CTTTCTTTTT CTCTGTCTCT 26400 CTGTCTCTCT CTGTGTCTGTCTCTCTGTCT GTGCCTATCT TCTGTCTTAC TCTCTTTCTC 26460 TGGCTGTCTG CCTGTCTCTCTCTCTCTCTC TGTCTGTCTC CGTCCCTCTC TCCCTGTCTG 26520 TCTGTTTCTC TCTCTGCCTCTCTCTCTCTC TGTCTGTCTC TTTCTCTGTC TGTCTGTCTC 26580 TCTCTTTCTT TTTCTCTGTCTCTCTGTCTC TCTCTGTGTC TGTCTCTCTT TCTGTGCCTA 26640 TCTTCTGTCT TACTCTCTTTCTCTGGCTGT CTGCCTGTCT CTCTCTCTCT GCCTGTCTCC 26700 GTCCCTCCCT CCCTGTCTGTCTGTTTCTCT CTCTGTCTCT GTCTCTCTGT CCATCTCTGT 26760 CTGTCTCTTT CTCTTTCTCTCTCTCTGTCT CTGTCTCTCT CTCTCTCTGC CTGTCTCTCT 26820 CACTGTGTCT GTCTTCTGTCTTACTCTCTT TCTCTTGCCT GCCTCTCTGT CTGTCTGTCT 26880 CTCTCCCTCC ATGTCTCTCTCTCTCTCTCA CTCACTCTCT CTCCGTCTCT CTCTCTTTCT 26940 GTCTGTTTCT CTCTCTGTCTGTCTCTCTCC CTCCATGTCT CTCTCTCTCT CTCTCACTCA 27000 CTCTCTCTCC GTCTCTCTCTCTCTTTCTGT CTGTTTCTCT CTCTGTCTGT CTCTCTCCCT 27060 CCATGTCTCT CTCTCTCCCTCTCACTCACT CTCTCTCCGT CTCTCTCTCT CTTTCTCTCT 27120 GTTTCTTTGT CTGTCTGTCTGTCTGTCTGT CTGTCTCTCT CTCTCTCTCT CTCTCTCTCT 27180 CTCTCTGTTT GTCTTTCTCCCTCCCTGTCT GTCTGTCTGT CTCTCTCTCT CTGTCTCTGT 27240 CTCTGTCTCT CTCTCTTTCTCTTTCTGTCT GTTTCTCTCT ATCTCTCGCT GTCCATCTCT 27300 GTCTTTCTAT GTCTGTCTCTTTCTCTGTCA GTCTGTCAGA CACACCCGTG CCGGTAGGGC 27360 CCTGCCCTTC CACGAGAGTGAGAAGCGCGT GCTTCGGTGC TTAGAGAGGC CGAGAGGAAT 27420 CTAGACAGGC GGGCCTTGCTGGGCTTCCCC ACTCGGTGTA CGATTTCGGG AGGTCGAGGC 27480 CGGGTCCCCG 
CTTGGATGCGAGGGGCATTT TCAGACTTTT CTCTCGGTCA CGTGTGGCGT 27540 CCGTACTTCT CCTATTTCCCCGATAAGTCT CCTCGACTTC AACATAAACT GTTAAGGCCG 27600 GACGCCAACA CGGCGAAACCCCGTCTCTAC TAAAAATACA AAGCTGAGTC GGGAGCGGTG 27660 GGGCAGGCCC TGTAATGCCAGCTCCTCGGG AGGCTGAGGC GGGAGAATCG CTTGAACCAG 27720 GGAAGCGGAG GCTGCAGGGAGCCGAGATCG CGCCACTGCA CTACGGCCCA GGCTGTAGAG 27780 TGAGTGAGAC TCGGTCTCTAAATAAATACG GAAATTAATT AATTCATTAA TTCTTTTCCC 27840 TGCTGACGGA CATTTGCAGGCAGGCATCGG TTGTCTTCGG GCATCACCTA GCGGCCACTG 27900 TTATTGAAAG TCGACGTTGACACGGAGGGA GGTCTCGCCG ACTTCACCGA GCCTGGGGCA 27960 ACGGGTTTCT CTCTCTCCCTTCTGGAGGCC CCTCCCTCTC TCCCTCGTTG CCTAGGGAAC 28020 CTCGCCTAGG GAACCTCCGCCCTGGGGGCC CTATTGTTCT TTGATCGGCG CTTTACTTTT 28080 CTTTGTGTTT TGGCGCCTAGACTCTTCTAC TTGGGCTTTG GGAAGGGTCA GTTTAATTTT 28140 CAAGTTGCCC CCCGGCTCCCCCCACTACCC ACGTCCCTTC ACCTTAATTT AGTGAGNCGG 28200 TTAGGTGGGT TTCCCCCAAACCGCCCCCCC CCCCCCGCCT CCCAACACCC TGCTTGGAAA 28260 CCTTCCAGAG CCACCCCGGTGTGCCTCCGT CTTCTCTCCC CTTCCCCCAC CCCTTGCCGG 28320 CGATCTCATT CTTGCCAGGCTGACATTTGC ATCGGTGGGC GTCAGGCCTC ACTCGGGGGC 28380 CACCGTTTTT GAAGATGGGGGCGGCACGGT CCCACTTCCC CGGAGGCAGC TTGGGCCGAT 28440 GGCATAGCCC CTTGACCCGCGTGGGCAAGC GGGCGGGTCT GCAGTTGTGA GGCTTTTCCC 28500 CCCGCTGCTT CCCGCTCAGGCCTCCCTCCC TAGGAAAGCT TCACCCTGGC TGGGTCTCGG 28560 TCACCTTTTA TCACGATGTTTTAGTTTCTC CGCCCTCCGG CCAGCAGAGT TTCACAATGC 28620 GAAGGGCGCC ACGGCTCTAGTCTGGGCCTT CTCAGTACTT GCCCAAAATA GAAACGCTTT 28680 CTGAAAACTA ATAACTTTNCTCACTTAAGA TTTCCAGGGA CGGCGCCTTG GCCCGTGTTT 28740 GTTGGCTTGT TTTGTTTCGTTCTGTTTTGT TTTGTTCGTG TTTTTCCTTT CTCGTATGTC 28800 TTTCTTTTCA GGTGAAGTAGAAATCCCCAG TTTTCAGGAA GACGTCTATT TTCCCCAAGA 28860 CACGTTAGCT GCCGTTTTTTCCTGTTGTGA ACTAGCGCTT TTGTGACTCT CTCAACGCTG 28920 CAGTGAGAGC CGGTTGATGTTTACNATCCT TCATCATGAC ATCTTATTTT CTAGAAATCC 28980 GTAGGCGAAT GCTGCTGCTGCTCTTGTTGC TGTTGTTGTT GTTGTTGTTG TCGTCGTTGC 29040 TGTTGTCGTT GTCGTTGTTGTTGTCGTTGT CGTTGTTTTC AAAGTATACC CCGGCCACCG 29100 TTTATGGGAT CAAAAGCATTATAAAATATG TGTGATTATT TCTTGAGCAC GCCCTTCCTC 29160 CCCCTCTCTC 
TGTCTCTCTGTCTGTCTCTG TCTCTCTCTT TCTCTGTCTG TCTTCTCTCT 29220 CTCTCTCTCT CTGTGTCTCTCTCTCTCTGC CTGTCTGTTT CTCTCTCTCT GCCTCTCTCT 29280 CTCTCTCTCT CTCTGCCTGTCTCTCTCACT GTGTCTGTCT TCTGTCTTAC TCCCTTTCTC 29340 TGTCTGTCTG TCGGTCTCTCTCTCTCTCTC TCCCTGTCTG TATGTTTCTC TCTGTCTCTG 29400 TCTCTCTCTC TCTTTCTGTTTCTCTCTCTC CGTCTCTGTC TTTCTCTGAC TGTCTCTCTC 29460 TTTCCTTCTC TCTGTCTCTCTCTGCCTGTC TCTCTCACTC TGTCTTCTGT CTTATCTCTC 29520 TCTCTGCCTG CCTGTCTCTCTCACTCTCTC TCTCTGTGTG TCTCTCTCTC TCTTTCTGTT 29580 TCTCTCTGTC TCTCTGTCCGTCTCTGTCTT TCTCTGTCTG TCTCTTTGTC TGTCTGTCTT 29640 TGTCTTTCCT TCTCTCTGTCTCTGTCTCTC TCACTGTGTC TGTCTTCTGT CTTAGTCTCT 29700 CTCTCTCTCT CTCCCTGTCTGTCTGTCTCT CTCTCTCTCT CCCCCTGTCT GTTTCTCTCT 29760 CTCTCTCTCT CTCTCTCTCTCTCTGTCTTT GTCTTTCTTT CTGTCTCTGT CTCTCTCTCT 29820 CTCTCTGTGT GTCTGTCTTCTGTCTTACTG TCTTTCTCTG CCTGTCTGTC TGTCTGTCTC 29880 TCTCTGTCTG TCTCTCTCTCTCTCTCCCCC TGTCGGCTGT TTCTCTGTCT CTGTCTGTGT 29940 CTCTCTTTCT GTCTGTTTCTCTCTGTCTGT CTTTCTCTCT CTGTCTCTTT CTCTCTGTCT 30000 CTCTGTCTGT CTCTGTCTCTCTCTCTGTCT CTCTCTCTCT GTGGGGGTGT GTGTGTGTGT 30060 GTGTATGTGT GTGTGTGTGTGTGTGTGTGT CTGCCTTCTG TCTTACTCTC TTTCTCTGCC 30120 TGTCTGTCTG CCTGTCTGTTTGTCTCTCTC TCTCTGCCTG TCTCTCTCCC TTCCTGTCTG 30180 TTTCTCTCTC TTTCTGTTTCTCTCTGTCTC TGTCCATCTC TGTCTTTCTC CGTCTGTCTC 30240 TTTATCTGTC TCTCTCCGTCTGTCTCTTTA TCTGTCTCTC TCTCTCTTTC TGTCTTTCTC 30300 TCTCTGTGTA TCGTTGTCTCTCTCTGTCTG TCTCTGTCTC TGTCTCTCTG TCTCTCTCTC 30360 TCTCTCTCTC TCTCTGTCTGTCTGTCCGTC TGTCTGTCTC GGTCTCTGCG TCTCGCTATC 30420 TCCCGCCCTC TCTTTTTTTGCAAAAGAAGC TCAAGTACAT CTAATCTAAT CCCTTACCAA 30480 GGCCTGAATT CTTCACTTCTGACATCCCAG ATTTGATCTC CCTACAGAAT GCTGTACAGA 30540 ACTGGCGAGT TGATTTCTGGACTTGGATAC CTCATAGAAA CTACATATGA ATAAAGATCC 30600 AATCCTAAAA TCTGGGGTGGCTTCTCCCTC GACTGTCTCG AAAAATCGTA CCTCTGTTCC 30660 CCTAGGATGC CGGAAGAGTTTTCTCAATGT GCATCTGCCC GTGTCCTAAG TGATCTGTGA 30720 CCGAGCCCTG TCCGTCCTGTCTCAAATATG TACGTGCAAA CACTTCTCTC CATTTCCACA 30780 ACTACCCACG GCCCCTTGTGGAACCACTGG CTCTTTGAAA AAAATCCCAG AAGTGGTTTT 30840 GGCTTTTTGG 
CTAGGAGGCCTAAGCCTGCT GAGAACTTTC CTGCCCAGGA TCCTCGGGAC 30900 CATGCTTGCT AGCGCTGGATGAGTCTCTGG AAGGACGCAC GGGACTCCGC AAAGCTGACC 30960 TGTCCCACCG AGGTCAAATGGATACCTCTG CATTGGCCCG AGGCCTCCGA AGTACATCAC 31020 CGTCACCAAC CGTCACCGTCAGCATCCTTG TGAGCCTGCC CAAGGCCCCG CCTCCGGGGA 31080 GACTCTTGGG AGCCCGGCCTTCGTCGGCTA AAGTCCAAAG GGATGGTGAC TTCCACCCAC 31140 AAGGTCCCAC TGAACGGCGAAGATGTGGAG CGTAGGTCAG AGAGGGGACC AGGAGGGGAG 31200 ACGTCCCGAC AGGCGACGAGTTCCCAAGGC TCTGGCCACC CCACCCACGC CCCACGCCCC 31260 ACGTCCCGGG CACCCGCGGGACACCGCCGC TTTATCCCCT CCTCTGTCCA CAGCCGGCCC 31320 CACCCCACCA CGCAACCCACGCACACACGC TGGAGGTTCC AAAACCACAC GGTGTGACTA 31380 GAGCCTGACG GAGCGAGAGCCCATTTCACG AGGTGGGAGG GGTGGGGGTG GGGTGGGTTG 31440 GGGGTTGTGG GGTCTGTGGCGAGCCCGATT CTCCCTCTTG GGTGGCTACA GGCTAGAAAT 31500 GAATATCGCT TCTTGGGGGGAGGGGCTTCC TTAGGCCATC ACCGCTTGCG GGACTACCTC 31560 TCAAACCCTC CCTTGAGGCCACAAAATAGA TTCCACCCCA CCCATCGACG TTTCCCCCGG 31620 GTGCTGGATG TATCCTGTCAAGAGACCTGA GCCTGACACC GTCGAATTAA ACACCTTGAC 31680 TGGCTTTGTG TGTTTGTTTGTTTCTGAGAT GGAGTCTTGC TCTGTCCCCC AGGCTGGCGT 31740 GCAGTGGCGT GATCTCAGCTCACTGGAACC TCTGCCTCCT GGGTTCAAGT GATTCTCCTG 31800 TCTCAGCGCC ACCATGGCCGGCTCATTTTT TTTTTTTTTT TTTTTGGTAG ACACGGGGTT 31860 TCACCCTCTT TCATTGGTTTTCACTGGAGA TTCTAGATTC GAGCCACACC TCATTCCGTG 31920 CCACAGAGAG ACTTCTTTTTTTTTTTTTTT TTTTTAAGCG CAACGCAACA TGTCTGCCTT 31980 ATTTGAGTGG CTTCCTATATCATTATAATT GTGTTATAGA TGAAGAAACG GTATTAAACA 32040 CTGTGCTAAT GATAGTGAAAGTGAAGACAA AAGAAAGGCT ATCTATTTTG TGGTTAGAAT 32100 AAAGTTGCTC AGTATTTAGAAGCTACCTAA ATACGTCAGC ATTTACACTC TTCCTAGTAA 32160 AAGCTGGCCG ATCTGAATAATCCTCCTTTA AACAAACACA ATTTTTGATA GGGTTAAGAT 32220 TTTTTTAAGA ATGCGACTCCTGCAAAATAG CTGAACAGAC GATACACATT TAAAAAAATA 32280 ACAACACAAG GATCAACCAGACTTGGGAAA AAATCGAAAA CCACACAAGT CTTATGAAGA 32340 ACTGAGTTCT TAAAATAGGACGGAGAACGT AGCTATCGGA AGAGAAGGCA GTATTGGCAA 32400 GTTGATTGTT ACGTTGGTCAGCAGTAGCTG GCACTATCTT TTTGGCCATC TTTCGGGCAA 32460 TGTAACTACT ACAGCAAAATGAGATATGAT CCATTAAACA ACATATTCGC AAATCAAAAA 32520 GTGTTTCAGT 
AATATAATGCTTCAGATTTA GAAGCAAATC AAATGATAGA ACTCCACTGC 32580 TGTAATAAGT CACCCCAAAGATCACCGTAT CTGACAAAAT AACTACCACA GGGTTATGAC 32640 TTCAGAATCA TACTTTCTTCTTGATATTTA CTTATGTATT TATTTTTTTT AATTTATTTC 32700 TCTTGAGACG CGTCTCGCTCTGTCGCCCAG GCTGGAGTGC GATGGTGTGA TCTCGGCTCA 32760 CTGCAACCGC CACCTCCCTGGGTTCAAGCG ATTCTCCTGC CTCAGCCTCC CGAGTAGCTG 32820 GGACTACAGG TGCCCGCCACCACGCCCAGC TAATCTTTAT ACTTTTAATA GAGACGGGGT 32880 TTCACCGTGT CGGCCCGGATGGTCTCGATC TCTTGACCTC GTGACCCGCC CGCCTCGGCC 32940 TCCCAAAGTG CTGGGATGACAGGCGTGAGC CACTGAGCCC GGCCTTCTCT TGACGTTTAA 33000 ACTATGAAGT CAGTCCAGAGAAACGCAATA AATGTCAACG GTGAGGATGG TGTTGAGGCA 33060 GAAGTAGGAC CACACTTTTTCCTATCTTAT TCAGTTGATA ACAATATGAC CTAGGTAGTA 33120 ATTTCCTATG TGCCTACTTATACACGAGTA CAAAAGAGTA AAACAGAGAG ACTGCTAAAT 33180 TAAAGGGTAC GTGAAGTTCTTCATAGTAAC TCCGTAAACT GGAACACTGT CAAAAAGCAG 33240 CAGCTAGTGA ATTGTTTCCATGTATTTTTC TATTATCCAA TAAGTGAACT ATGCTATTCC 33300 TTTCCAGTCT CCCAAGCACTTCTTGTCCCC ATCACCACTT CGGTGCTCGA AGAAAAAGTA 33360 AGCAAATCAA GGAACACAAGCTAAAGAAAC ACACACACAA ACCAAAGACA ACTACAGCGT 33420 CTGCAAAAGT TTGCTAGAAGACTGAAACTG TTGAGTATAA GGATCTGGTA TTCTACGATC 33480 ATGAGTTCAC TTCAGAGTTTGTTCAAGACA TACGTTTCGT AAGGAAACAT CTTAGTTAGA 33540 AGTTATTCAG CAGTAGGTACCATCCCTAAG TATTTTTCAC CAAATCCGTG ACAATAAAGA 33600 GCTATCTAAC CAGAAAAATTAGCGAGTACG GGCACCATCC ATAGGGCTTT GTCTTTACGC 33660 TTCATTAGCA CTTACCATGCCTTACAATGT CTAGGATTGA CCCTGATAGC ATTTCGAAAA 33720 CAAGCTAATG CTTTGTCCAGTTCTTCAGTG AAGACAACTC ACGCCCTAAT GCGCTATAGG 33780 CATAAGCATC ATTTGGATCCACTTCGAGAG TTCTCTGGAA GAATTGAATC GCAATATCTG 33840 GTTCCCGTTT GCAGACCGAAACAGTTTCCC TGCAGCACAC CAGGCCTCTG GCTGGCGAAT 33900 TTTTATCCAT GTCTGTGAAGTCTTTGGACA GAACTGAAAG AGCAACCTCT TTCGGAGGAT 33960 GCCAAAGTGT TGTAGAGTAGATCTCCATGC CTTCGACTCT GTAATTCTCA ATCCTCCTAA 34020 CCTCTGAGAA TTGTCTTTCAGCTTGCGTGG ACTCTGAAAG TTTACAATAG GCCNTTTCCG 34080 ATTTGGCACA GTACCCAACCGGTATTGCAG TGGTGAGAAG CTAGATGGCT CAAGATGCTG 34140 ATAGCTTCTT TGCCGTGGTAAGAACACAAA GCTAAATAAC CTTTCCCCCT TTCACGAAGA 34200 AGGCTCATCA 
AGCCTTCCGCTGCTGCTTTT TGTAGATTAA AAGCCTGAAT CTGAGGCGCG 34260 ATTGCGGCTA TTTTCCCTTCTGAAATGACG GAAGAGTCCA ATTTTGTCAC TTCCAGGCTA 34320 TCACTTATGT TCGGTGGAGTTATTGCTCCT TTATTAGTTT TACTTTTGGT TCTTCTGTTT 34380 GGGATTTTAG GTGGAAACTTCATTTTTAAT TTTCTCCTAA TTCTCCTCGG TTGTGGAGCT 34440 GTCACTAGTC AAGAGTCGTGAATTTCTTCG AGGNCGGTGC ATTTGGGGGA GATGCCATAG 34500 TGGGGCTCAA TACCTGAGGTGTTGCCCTTG TCGGCGGACC AGAACTTTGT GTTTTTGCAA 34560 GGACTGGAGT TACCTTTCGGCTCTTTCCCC TCTGCGAGAA GACAGACGGT GTTCCGGTTT 34620 GGCCGATTCT GGCAACAGGCTTTTCTGAAG GGGCTCCGGT GGATGGCACG TCAGTGACAG 34680 ACGGTGTCTC ATACCAGTGCAGTTTTGTCA ATAGGGTCCG TCTCCGGGAC TTGGGGTTTC 34740 TAATGGCAAA ATGCCAACACTTGGGGTTAA TGGACTAACA GCTGCTGGTC CTCCTAATAA 34800 ACTTCGACCA GTTTTTGGTTTATGTTGAAC CTGTTTAGAT CATATGGAAG TTCCTGTTCC 34860 CAGTGGGACA GTATCAGGTGAAAGGACAGC TGAATCGATA GAAGACACTG GGGAGTCTGT 34920 ATTCAAGGAG TACTTTGAATTGGAAGATTC TAAATTCCAT CCGTTTCATT CGACGGTGTC 34980 CTGGGGTGTT TCCGTAAGAACGGTCTCGGG CTGTCTGTGA CATAAACTAG GACGAGGTCC 35040 AAGTGTTGTG GCGCAACACTTGGACAGGCA GTTGCTAAAG CTCTCTAGAG AGGTGAATCA 35100 AAATGTTTGG TCAGGATCTGGCTTTTCCCC CCTATTTCAC ATCATGATTC AAAGGGACAC 35160 CAGAGGAAAG GATTTCAACGAAGGCTCTTT TGGTCACATT CTGATCCTTT GGTAAGCCGA 35220 TCTGTCTTGC AATATACATGTCCCGACGAT GGAAGGGGAA AGCGAGCTGA ATCACCAAAC 35280 TCAGGAACGA TAATATCATCGTGGCTTTTC TGCTTATGAA ACACTCCACC CGATAAGATT 35340 TGATCCCCTT CTGCAAGCTTGCTGAGATCA ACACAACATT TCGCAAGCAG GCATTTGCAT 35400 TGCGGGGTAG TACAACTGTGTCCTTTCAAG AGTCTATATG TTTTATAGGC CTTTCCTGAG 35460 CGGTAAGAAC AGGTCGCCAGTAAGAACAAG GCTTCTTCTG AGTGTACTTC TGCATAAAGG 35520 CGTTCTGCGG GGGAAACCGCATCTCGGTAG GCATAGTGGT TTAGTGCTTG CCATATAGCA 35580 GCCTGGACGG GTCCCTGCAGCACCGCCATC CTCGAGGCTC AGGCCCACTT TCTGCAGTGC 35640 CACAGGCACC CCCCCCCCCCCATAGCGGCT CCGGCCCGGC CAGCCCCGGC TCATTTAAAG 35700 GCACCAGCCG CCGTTACCGGGGGATGGGGG AGTCCGAGAC AGAATGACTT CTTTATCCTG 35760 CTGACTCTGG AAAGCCCGGCGCCTTGTGAT CCATTGCAAA CCGAGAGTCA CCTCGTGTTT 35820 AGAACACGGA TCCACTCCCAAGTTCAGTGG GGGGATGTGA GGGGTGTGGC AGGTAGGACG 35880 AAGGACTCTC 
TTCCTTCTGATTCGGTCTGC ACAGTGGGGC CTAGGGCTGG AGCTCTCTCC 35940 GTGCGGACCG CTGACTCCCTCTACCTTGGG TTCCCTCGGC CCCACCCTGG AACGCCGGGC 36000 CTTGGCAGAT TCTGGCCCTTTCTGGCCCTT CAGTCGCTGT CAGAAACCCC ATCTCATGCT 36060 CGGATGCCCC GAGTGACTGTGGCTCGCACC TCTCCGGAAA CATTGGAAAT CTCTCCTCTA 36120 CGCGCGGCCA CCTGAAACCACAGGAGCTCG GGACACACGT GCTTTCGGGA GAGAATGCTG 36180 AGAGTCTCTC GCCGACTCTCTCTTGACTTG AGTTCTTCGT GGGTGCGTGG TTAAGACGTA 36240 GTGAGACCAG ATGTATTAACTCAGGCCGGG TGCTGGTGGC TCACGCCTGT AACCCCAACA 36300 CTTTGGGAGG CCGAGGCCGTAGGATCCCTC GAGGAATCGC CTAACCCTGG GGAGGTTGAG 36360 GTTGCAGTGA GTGAGCCATAGTTGTGTCAC TGTGCTCCAG TCTGGGCGAA AGACAGAATG 36420 AGGCCCTGCC ACAGGCAGGCAGGCAGGCAG GCAGGCAGAA AGACAACAGC TGTATTATGT 36480 TCTTCTCAGG GTAGGAAGCAAAAATAACAG AATACAGCAC TTAATTAATT TTTTTTTTTT 36540 CCTTCGGACG GAGTTTCACTCTTGGTGCCC ACGCTGGAGT GCAGTGGCAC CATCTCGGCT 36600 CACCGCAACC TCCACCTCCCGCGTTCAAGC GATTCTCCTG CCTCAGCCTC CTGAGTAGCT 36660 GGGATTACAG GGAGGAGCCACCACACCCAG CTGATTTTGT ATTGTTAGTA GAGACGGCAT 36720 TTCTCCATGT GGGTCAGGCTGGTCTCGAAC TGGCGACCCC AGTGGATCTG CCCGCCCCGG 36780 CCTCCCAAAG TGCTGGGGTGACAGGCGTGA GCCATCGTGA CTGGCCGGCT ACGTTTATTT 36840 ATTTATTTTT TTAATTATTTTACTTTTTTT TAGTTTTCCA TTTTAATCTA TTTATTTATT 36900 TACATTTATT TATTTATTTATTTATTTACT TATTTATTTA TTTTCGAGAC AGACTCTCGC 36960 TCTGCTGCCC AGGCTGGAGTGCAGCGGCGT GATCTCGGCT CACTGCAACG TCCGCCTCCC 37020 GGGTTCACGC CATTCTCCTGCCTCAGCCTC CCAAGTAGCT GGGACTACAG GCGCCCGCCA 37080 CCGTGCCCGG CTAACTTTTTGTATTTTGAG TAGAGATGGG GTTTCACTGT GGTAGCCAGG 37140 ATGGTCTCGA TCTCCTGACCCCGTGATCCG TCCACCTCGG CCTCCCAAAG TGCTGGGATG 37200 ACAGGCGTGA GCCACCGGCCCCGGCCTATT TATCTATTTA TTAACTTTGA GTCCAGGTTA 37260 TGAAACCAGT TAGTTTTTGTAATTTTTTTT TTTTTTTTTT TTTTTTGAGA CGAGGTTTCA 37320 CCGTGTTGCC AAGGCTTGGACCGAGGGATC CACCGGCCCT CGGCCTCCCA AAAGTGCGGG 37380 GATGACAGGC GCGAGCCTACCGCGCCCGGA CCCCCCCTTT CCCCTTCCCC CGCTTGTCTT 37440 CCCGACAGAC AGTTTCACGGCAGAGCGTTT GGCTGGCGTG CTTAAACTCA TTCTAAATAG 37500 AAATTTGGGA CGTCAGCTTCTGGCCTCACG GACTCTGAGC CGAGGAGTCC CCTGGTCTGT 37560 CTATCACAGG 
ACCGTACACGTAAGGAGGAG AAAAATCGTA ACGTTCAAAG TCAGTCATTT 37620 TGTGATACAG AAATACACGGATTCACCCAA AACACAGAAA CCAGTCTTTT AGAAATGGCC 37680 TTAGCCCTGG TGTCCGTGCCAGTGATTCTT TTCGGTTTGG ACCTTGACTG AGAGGATTCC 37740 CAGTCGGTCT CTCGTCTCTGGACGGAAGTT CCAGATGATC CGATGGGTGG GGGACTTAGG 37800 CTGCGTCCCC CCAGGAGCCCTGGTCGATTA GTTGTGGGGA TCGCCTTGGA GGGCGCGGTG 37860 ACCCACTGTG CTGTGGGAGCCTCCATCCTT CCCCCCACCC CCTCCCCAGG GGGATCCCAA 37920 TTCATTCCGG GCTGACACGCTCACTGGCAG GCGTCGGGCA TCACCTAGCG GTCACTGTTA 37980 CTCTGAAAAC GGAGGCCTCACAGAGGAAGG GAGCACCAGG CCGCCTGCGC ACAGCCTGGG 38040 GCAACTGTGT CTTCTCCACCGCCCCCGCCC CCACCTCCAA GTTCCTCCCT CCCTTGTTGC 38100 CTAGGAAATC GCCACTTTGACGACCGGGTC TGATTGACCT TTGATCAGGC AAAAACGAAC 38160 AAACAGATAA ATAAATAAAATAACACAAAA GTAACTAACT AAATAAAATA AGTCAATACA 38220 ACCCATTACA ATACAATAAGATACGATACG ATAGGATGCG ATAGGATACG ATAGGATACA 38280 ATACAATAGG ATACGATACAATACAATACA ATACAATACA ATACAATACA ATACAATACA 38340 ATACAATACA ATACAATACGCCGGGCGCGG TGGCTCATGC CTGTCATCCC GTCACTTTGG 38400 GATGCCGAGG TGGACGCATCACCTGAAGTC GGGAGTTGGA GACAAGCCCG ACCAACATGG 38460 AGAAATCCCG TCTCAATTGAAAATACAAAA CTAGCCGGGC GCGGTGGCAC ATGCCTATAA 38520 TCCCAGCTGC TAGGAAGGCTGAGGCAGGAG AATCGCTTGA ACCTGGGAAG CGGAGGTTGC 38580 AGTGAGCCGA GATTGCGCCATCGCACTCCA GTCTGAGCAA CAAGAGCGAA ACTCCGTCTC 38640 AAAAATAAAT ACATAAATAAATACATACAT ACATACATAC ATACATACAT ACATACATAC 38700 ATAAATTAAA ATAAATAAATAAAATAAAAT AAATAAATGG GCCCTGCGCG GTGGCTCAAG 38760 CCTGTCATCC CCTCACTTTGGGAGGCCAAG GCCGGTGGAT CAAGAGGCGG TCAGACCAAC 38820 AGGGCCAGTA TGGTGAAACCCCGTCTCTAC TCACAATACA CAACATTAGC CGGGCGCTGT 38880 GCTGTGCTGT ACTGTCTGTAATCCCAGCTA CTCGGGAGGC CGAGCTGAGG CAGGAGAATC 38940 GCTTGAACCT GGGAGGCGGAGGTTGCAGTG AGCCGAGATC GCGCCACTGC AACCCAGCCT 39000 GGGCGACAGA GCGAGACTCCGTCTCCAAAA AATGAAAATG AAAATGAAAC GCAACAAAAT 39060 AATTAAAAAG TGAGTTTCTGGGGAAAAAGA AGAAAAGAAA AAAGAAAAAA ACAACAAAAC 39120 AGAACAACCC CACCGTGACATACACGTACG CTTCTCGCCT TTCGAGGCCT CAAACACGTT 39180 AGGAATTATG CGTGATTTCTTTTTTTAACT TCATTTTATG TTATTATCAT GATTGATGTT 39240 TCGAGACGGA 
GTCTCGGAGGCCCGCCCTCC CTGGTTGCCC AGACAACCCC GGGAGACAGA 39300 CCCTGGCTGG GCCCGATTGTTCTTCTCCTT GGTCAGGGGT TTCCTTGTCT TTCTTCGTGT 39360 CTTTAACCCG CGTGGACTCTTCCGCCTCGG GTTTGACAGA TGGCAGCTCC ACTTTAGGCC 39420 TTGTTGTTGT TGGGGACTTTCCTGATTCTC CCCAGATGTA GTGAAAGCAG GTAGATTGCC 39480 TTGCCTGGCC TTGCCTGGCCTTGCCTTTTC TTTCTTTCTT TCTTTCTTTA TTACTTTCTC 39540 TTTTTCTTCT TCTTCTTCTTCTTTTTTTTG AGACAGAGTT TCACTCTTGT TGCCCAGGTC 39600 AGAGGGCAAT GGCGCGATCTCGGCTCACCG CACCCTCCGC CTCCCAGGTT CAAGCGATTC 39660 TCCTGCCTCA GCCTCCTGATTAGCTGGGAT TACAGGCATG GGCCACCGTG CTGGCTGATG 39720 TTTGTACTTT TAGTAGAGACGGTGTTTTTC CATGTTGGTC AGGCTGGTCT CCCACTCCCA 39780 ACCTCAGGTG GTCCGCCTGCCTTAGCCTCC CAAAGTGCTG GGATGACAGG CGTGCAACCG 39840 CGCCCAGCCT CTCTCTCTCTCTCTCTCTCT CTCGCTCGCT TGCTTGCTTG CTTTCGTGCT 39900 TTCTTGCTTT CCCGTTTTCTTGCTTTCTTT CTTTCTTTCG TTTCTTTCAT GCTTGCTTTC 39960 TTGCTTGCTT GCTTGCTTTCGTGCTTTCTT GCTTTCCTGT TTTCTTTCTT TCTTTCTTTC 40020 TTTCTTTCTT TTGTTTCTTTCTTGCTTGCT TTCTTGCTTG CTTGCTTGCT TTCGTGCTTT 40080 CTTGCTTTCC TGTTTTCTTTCTTTCTTTCT TTCTTTTCTT TCTTTCTTGC TTGCTTTCCT 40140 GCTTGCTTGC TTTCGTGCTTTCTTGTTTTC TCGATTTCTT TCTTTCTTTT GTTTCTTTCC 40200 TGCTTGCTTT CTTGCTTGCTTGCTTTCGTG CTTCTTGCTT TCCTGTTTTC TTTCTTTCTT 40260 TCTTTCTTTT GTTTCTTTCTTGCTTGCTTT CTTGCTTGCT TGCTTTCGTG CTGTCTTGTT 40320 TCTCGATTTC TTTCTTTCTTTTGTTTCTTT CCTGCTTGCT TTCTTGCTTG ATTGCTTTGG 40380 TGCTTTCTTG CTTTCTTGTTTTCTTTCTTT CTTTTGTTTC TTTCTTTCTT GCTTCCTTGT 40440 TTTCTTGCTT TCTTGCTTGCTTGCTTTCGT GCTTTCTTGT TTTCTTGCTT TCTTTCTTTT 40500 GTTTCTTTCT TGCTTGCTTTCTTGCTTCCT TGTTTTCTTG CTTTCTTGCT TGCTTGCTTT 40560 CGTGCTTTCT TTCTTGCTTTCTTTTCTTTC TTTCTTTTCT TTTTCTTTCT TTCTTGCTTT 40620 CTTTTCTTTC ATCATCATCTTTCTTTCTTT CCTTTCTTTC TTTCTTTCTT TCTATCTTTC 40680 TTTCTTTCTT TCTTTCTTTCTTTCTTTCTT TCTTTCTGTT TCGTCCTTTT GAGACAGCGT 40740 TTCACTCTTG TTTCCACGGCTAGAGTGCAA TGGCGCGATC TTGGCTCACC GCACCTTCCG 40800 CCTCCCGGGT TCGAGCGCTTCTCCTGCCTC CAGCCTCCCG ATTAGCGGGG ATTGACAGGG 40860 AGGCACCCCC ACGCCTGGCTTGGCTGATGT TTGTGTTTTT AGTAGGCACG CCGTGTCTCT 40920 CCATGTTGCT 
CAGGCTGGTCTCCAACTCCC GACCTCCTGT GATGCGCCCA CCTCGGCCTC 40980 TCGAAGTGCT GGGATGACGGGCGTGACGAC CGTGCCCGGC CTGTTGACTC ATTTCGCTTT 41040 TTTATTTCTT TCGTTTCCACGCGTTTACTT ATATGTATTA ATGTAAACGT TTCTGTACGC 41100 TTATATGCAA ACAACGACAACGTGTATCTC TGCATTGAAT ACTCTTGCGT ATGGTAAATA 41160 CGTATCGGTT GTATGGAAATAGACTTCTGT ATGATAGATG TAGGTGTCTG TGTTATACCA 41220 ATAAATACAC ATCGCTCTATAAAGAAGGGA TCGTCGATAA AGACGTTTAT TTTACGTATG 41280 AAAAGCGTCG TATTTATGTGTGTAAATGAA CCGAGCGTAC GTAGTTATCT CTGTTTTCTT 41340 TCTTCCTCTC CTTCGTGTTTTTCTTCCTTC CTTTCTTCCT TTCTCTCCTT CTTTAGGTTT 41400 TTCTTCCTCT CTTCCTTTCCTTCTTTCTCT CTTTCTGTCC TTTTTTCCTT CGTGCTTTAT 41460 TTCTCTTTCG TTCCCTGTGTTTCCTTCTTT TTTCTTTCCT CTCTGTTTCT TTTTCCCTTC 41520 TTTCCTTCGT TTCTTTCCTCATTCTTTCTC TCTTTTTCGT TGTTTCTTTC CTTCCCGTCT 41580 GTCTTTTAAA AAATTGGAGTGTTTCAGAAG TTTACTTTGT GTATCTACGT TTTCTAAATT 41640 GTCTCTCTTT TCTCCATTTTCTTCCTCCCT CCCTCCCTCC CTCCCTGCTC CCTTCCCTCC 41700 CTCCTTCCCT TTCGCCATCTGTCTCTTTTC CCCACTCCCC TCCCCCCGTC TGTCTCTGCG 41760 TGGATTCCGG AAGAGCCTACCGATTCTGCC TCTCCGTGTG TCTGCAGCGA CCCCGCGACC 41820 GAGTCCTTGT GTGTTCTTTCTCCCTCCCTC CCTCCCTCCC TCCCTCCCTC CCTCCCTGCT 41880 TCCGAGAGGC ATCTCCAGAGACCGCGCCGT GGGTTGTCTT CTGACTCTGT CGCGGTCGAG 41940 GCAGAGACGC GTTTTGGGCACCGTTTGTGT GGGGTTGGGG CAGAGGGGCT GCGTTTTCGG 42000 CCTCGGGAAG AGCTTCTCGACTCACGGTTT CGCTTTCGCG GTCCACGGGC CGCCCTGCCA 42060 GCCGGATCTG TCTCGCTGACGTCCGCGGCG GTTGTCGGGC TCCATCTGGC GGCCGCTTTG 42120 AGATCGTGCT CTCGGCTTCCGGAGCTGCGG TGGCAGCTGC CGAGGGAGGG GACCGTTTTG 42180 GCTGTGAGCT AGGCAGAGCTCCGGAAAGCC CGCGGTCGTC AGCCCGGCTG GCCCGGTGGC 42240 GCCAGAGCTG TGGCCGGTCGCTTGTGAGTC ACAGCTCTGG CGTGCAGGTT TATGTGGGGG 42300 AGAGGCTGTC GCTGCGCTTCTGGGCCCGCG GCGGGCGTGG GGCTGCCCGG GCCGGTCGAC 42360 CAGCGCGCCG TAGCTCCCGAGGCCCGAGCC GCGACCCGGC GGACCCGCCG CGCGTGGCGG 42420 AGGCTGGGGA CGCCCTTCCCGGCCCGGTCG CGGTCCGCTC ATCCTGGCCG TCTGAGGCGG 42480 CGGCCGAATT CGTTTCCGAGATCCCCGTGG GGAGCCGGGG ACCGTCCCGC CCCCGTCCCC 42540 CGGGTGCCGG GGAGCGGTCCCCGGGCCGGG CCGCGGTCCC TCTGCCGCGA TCCTTTCTGG 42600 CGAGTCCCCG 
TGGCCAGTCGGAGAGCGCTC CCTGAGCCGG TGCGGCCCGA GAGGTCGCGC 42660 TGGCCGGCCT TCGGTCCCTCGTGTGTCCCG GTCGTAGGAG GGGCCGGCCG AAAATGCTTC 42720 CGGCTCCCGC TCTGGAGACACGGGCCGGCC CCTGCGTGTG GCCAGGGCGG CCGGGAGGGC 42780 TCCCCGGCCC GGCGCTGTCCCCGCGTGTGT CCTTGGGTTG ACCAGAGGGA CCCCGGGCGC 42840 TCCGTGTGTG GCTGCGATGGTGGCGTTTTT GGGGACAGGT GTCCGTGTCC GTGTCGCGCG 42900 TCGCCTGGGC CGGCGGCGTGGTCGGTGACG CGACCTCCCG GCCCCGGGGG AGGTATATCT 42960 TTCGCTCCGA GTCGGCAATTTTGGGCCGCC GGGTTATAT 42999 (2) INFORMATION FOR SEQ ID NO: 18: (i)SEQUENCE CHARACTERISTICS: (A) LENGTH: 175 base pairs (B) TYPE: nucleicacid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE:Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENTTYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ IDNO: 18: CTCCCGCGCG GCCCCCGTGT TCGCCGTTCC CGTGGCGCGG ACAATGCGGTTGTGCGTCCA 60 CGTGTGCGTG TCCGTGCAGT GCCGTTGTGG AGTGCCTCGC TCTCCTCCTCCTCCCCGGA 120 GCGTTCCCAC GGTTGGGGAC CACCGGTGAC CTCGCCCTCT TCGGGCCTGGATCCG 175 (2) INFORMATION FOR SEQ ID NO: 19: (i) SEQUENCECHARACTERISTICS: (A) LENGTH: 755 base pairs (B) TYPE: nucleic acid (C)STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: GenomicDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE:<Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: GGTCTGGTGG GAATTGTTGA CCTCGCTCTC GGGTGCGGCC TTTGGGGAAC GGCGGGGTCG 60GTCGTGCCCG GCGCCGGACG TGTGTCGGGG CCCACTTCCC GCTCGAGGGT GGCGGTGGCG 120GCGGCGTTGG TAGTCTCCCG TGTTGCGTCT TCCCGGGCTC TTGGGGGGGG TGCCGTCGTT 180TTCGGGGCCG GCGTTGCTTG GCTTACGCAG GCTTGGTTTG GGACTGCCTC AGGAGTCGTG 240GGCGGTGTGA TTCCCGCCGG TTTTGCCTCG CGTCTGCCTG CTTTGCCTCG GGTTTGCTTG 300GTTCGTGTCT CGGGAGCGGT GGTTTTTTTT TTTTTCGGGT CCCGGGGAGA GGGGTTTTTC 360CGGGGGACGT TCCCGTCGCC CCCTGCCGCC GGTGGGTTTT CGTTTCGGGC TGTGTTCGTT 420TCCCCTTCCC CGTTTCGCCG TCGGTTCTCC CCGGTCGGTC GGCCCTCTCC CCGGTCGGTC 480GCCCGGCCGT GCTGCCGGAC CCCCCCTTCT GGGGGGGATG CCCGGGCACG CACGCGTCCG 540GGCGGCCACT GTGGTCCGGG AGCTGCTCGG CAGGCGGGTG AGCCAGTTGG AGGGGCGTCA 
600TGCCCCCGCG GGCTCCCGTG GCCGACGCGG CGTGTTCTTT GGGGGGGCCT GTGCGTGCGG 660GAAGGCTGCG CACGTTGTCG GTCCTTGCGA GGGAAAGAGG CTTTTTTTTT TTAGGGGGTC 720GTCCTTCGTC GTCCCGTCGG CGGTGGATCC GGCCT 755 (2) INFORMATION FOR SEQ IDNO: 20: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 463 base pairs (B)TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii)MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO(v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCEDESCRIPTION: SEQ ID NO: 20: GGCCGAGGTG CGTCTGCGGG TTGGGGCTCG TCCGGCCCCGTCGTCCTCCG GGAAGGCGTT 60 TAGCGGGTAC CGTCGCCGCG CCGAGGTGGG CGCACGTCGGTGAGATAACC CCGAGCGTGT 120 TTCTGGTTGT TGGCGGCGGG GGCTCCGGTC GATGTCTTCCCCTCCCCCTC TCCCCGAGGC 180 CAGGTCAGCC TCCGCCTGTG GGCTTCGTCG GCCGTCTCCCCCCCCCTCAC GTCCCTCGCG 240 AGCGAGCCCG TCCGTTCGAC CTTCCTTCCG CCTTCCCCCCATCTTTCCGC GCTCCGTTGG 300 CCCCGGGGTT TTCACGGCGC CCCCCACGCT CCTCCGCCTCTCCGCCCGTG GTTTGGACGC 360 CTGGTTCCGG TCTCCCCGCC AAACCCCGGT TGGGTTGGTCTCCGGCCCCG GCTTGCTCTT 420 CGGGTCTCCC AACCCCCGGC CGGAAGGGTT CGGGGGTTCCGGG 463 (2) INFORMATION FOR SEQ ID NO: 21: (i) SEQUENCE CHARACTERISTICS:(A) LENGTH: 378 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS:single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii)HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi)ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: GGATTCTTCAGGATTGAAAC CCAAACCGGT TCAGTTTCCT TTCCGGCTCC GGCCGGGGGG 60 GGCGGCCCCGGGCGGTTTGG TGAGTTAGAT AACCTCGGGC CGATCGCACG CCCCCCGTGG 120 CGGCGACGACCCATTCGAAC GTCTGCCCTA TCAACTTTCG ATGGTAGTCG ATGTGCCTAC 180 CATGGTGACCACGGGTGACG GGGAATCAGG GTTCGATTCC GGAGAGGGAG CCTGAGAAAC 240 GGCTACCACATCCAAGGAAG GCAGCAGGCG CGCAAATTAC CCACTCCCGA CCCGGGGAGG 300 TAGTGACGAAAAATAACAAT ACAGGACTCT TTCGAGGCCC TGTAATTGGA ATGAGTCCAC 360 TTTAAATCCTTTAAGCAG 378 (2) INFORMATION FOR SEQ ID NO: 22: (i) SEQUENCECHARACTERISTICS: (A) LENGTH: 378 base pairs (B) TYPE: nucleic acid (C)STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: GenomicDNA (iii) 
HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE:<Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: GATCCATTGG AGGGCAAGTC TGGTGCCAGC AGCCGCGGTA ATTCCAGCTC CAATAGCGTA 60TATTAAAGTT GCTGCAGTTA AAAAGCTCGT AGTTGGATCT TGGGAGCGGG CGGGCGGTCC 120GCCGCGAGGC GAGTCACCGC CCGTCCCCGC CCCTTGCCTC TCGGCGCCCC CTCGATGCTC 180TTAGCTGAGT TGTCCCGCGG GGCCCGAAGC GTTTACTTTG AAAAAATTAG AGTTGTTTCA 240AAGCAGGCCC GAGCCGCCTG GATACCGCCA GCTAGGAAAT AATGGAATAG GACCGCGGTT 300CCTATTTTGT TTGGTTTTCG GAACTGAGCC CATGATTAAG GGAAACGGCC GGGGGCATTC 360CCTTATTGCG CCCCCCTA 378 (2) INFORMATION FOR SEQ ID NO: 23: (i) SEQUENCECHARACTERISTICS: (A) LENGTH: 719 base pairs (B) TYPE: nucleic acid (C)STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: GenomicDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE:<Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: GGATCTTTCC CGCTCCCCGT TCCTCCCGGC CCCTCCACCC GCGCGTCTCC CCCCTTCTTT 60TCCCCTCTCC GGAGGGGGGG GAGGTGGGGG CGCGTGGGCG GGGTCGGGGG TGGGGTCGGC 120GGGGGACCGC CCCCGGCCGG CAAAAGGCCG CCGCCGGGCG CACTTCAACC GTAGCGGTGC 180GCCGCGACCG GCTACGAGAC GGCTGGGAAG GCCCGACGGG GAATGTGGCT CGGGGGGGGC 240GGCGCGTCTC AGGGCGCGCC GAACCACCTC ACCCCGAGTG TTACAGCCCT CCGGCCGCGC 300TTTCGCGGAA TCCCGGGGCC GAGGGGAAGC CCGATACCCG TCGCCGCGCT TTTCCCCTCC 360CCCCGTCCGC CTCCCGGGCG GGCGTGGGGG TGGGGGCCGG GCCGCCCCTC CCACGCCCGT 420GGTTTCTCTC TCTCCCGGTC TCGGCCGGTT TGGGGGGGGG AGCCCGGTTG GGGGCGGGGC 480GGACTGTCCT CAGTGCGCCC CGGGCGTCGT CGCGCCGTCG GGCCCGGGGG GTTCTCTCGG 540TCACGCCGCC CCCGACGAAG CCGAGCGCAC GGGGTCGGCG GCGATGTCGG CTACCCACCC 600GACCCGTCTT GAAACACGGA CCAAGGAGTC TAACGCGTGC GCGAGTCAGG GGCTCGCACG 660AAAGCCGCCG TGGCGCAATG AAGGTGAAGG GCCCCGTCCG GGGGCCCGAG GTGGGATCC 719 (2)INFORMATION FOR SEQ ID NO: 24: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH:685 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D)TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO(iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL 
SOURCE:(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: CGAGGCCTCT CCAGTCCGCCGAGGGCGCAC CACCGGCCCG TCTCGCCCGC CGCGTCGGGG 60 AGGTGGAGCA CGAGCGTACGCGTTAGGACC CGAAAGATGG TGAACTATGC CTGGGCAGGG 120 CGAAGCCAGA GGAAACTCTGGTGGAGGTCC GTAGCGGTCC TGACGTGCAA ATCGGTCGTC 180 CGACCTGGGT ATAGGGGCGAAAGACTAATC GAACCATCTA GTAGCTGGTT CCCTCCGAAG 240 TTTCCCTCAG GATAGCTGGCGCTCTCGCAA CCTTCGGAAG CAGTTTTATC CGGGTAAAGG 300 CGGAATGGAT TAGGAGGTCTTGGGGCCGGA AACGATCTCA AACTATTTCT CAAACTTTAA 360 ATGGGTAAGG AAGCCCGGCTCGCTGGCGTG GAGCCGGGCG TGGAATGCGA GTGCCTAGTG 420 GGCCACTTTT GGTAAGCAGAACTGGCGCTG CGGGATGAAC CGAACGCCGG GTTAAGGCGC 480 CCGATGCCGA CGCTCATCAGACCCCAGAAA AGGTGTTGGT TGATATAGAC AGCAGGACGG 540 TGGCCATGGA AGTCGGAATCCGCTAAGGAG TGTGTAACAA CTCACCTGCC GAATCAACTA 600 GCCCTGAAAA TGGATGGCGCTGGAGCGTCG GGCCCATACC CGGCCGTCGC CGGCAGTCGG 660 AACGGGACGG GACGGGAGCGGCCGC 685 (2) INFORMATION FOR SEQ ID NO: 25: (i) SEQUENCECHARACTERISTICS: (A) LENGTH: 33 base pairs (B) TYPE: nucleic acid (C)STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: GenomicDNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE:<Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: GAGGAATTCC CCTATCCCTA ATCCAGATTG GTG 33 (2) INFORMATION FOR SEQ IDNO: 26: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 35 base pairs (B)TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii)MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO(v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCEDESCRIPTION: SEQ ID NO: 26: AAACTGCAGG CCGAGCCACC TCTCTTCTGT GTTTG 35(2) INFORMATION FOR SEQ ID NO: 27: (i) SEQUENCE CHARACTERISTICS: (A)LENGTH: 33 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single(D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL:NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINALSOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: AGGAATTCAC AGAAGAGAGGTGGCTCGGCC TGC 33 (2) INFORMATION FOR SEQ ID NO: 28: (i) SEQUENCECHARACTERISTICS: (A) 
LENGTH: 34 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: AGCCTGCAGG AAGTCATACC TGGGGAGGTG GCCC 34
(2) INFORMATION FOR SEQ ID NO: 29: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 80 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: AAACTGCAGG TTAATTAACC CTAACCCTAA CCCTAACCCT AACCCTAACC CTAACCCTA 60 CCCTAACCCT AACCCGGGAT 80
(2) INFORMATION FOR SEQ ID NO: 30: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 19 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: TTGGGCCCTA GGCTTAAGG 19
(2) INFORMATION FOR SEQ ID NO: 31: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 25 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: GCCAGGGTTT TCCCAGTCAC GACGT 25
(2) INFORMATION FOR SEQ ID NO: 32: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 26 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: GCTGCAAGGC GATTAAGTTG GGTAAC 26
(2) INFORMATION FOR SEQ ID NO: 33: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 26 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: TATGTTGTGT GGAATTGTGA GCGGAT 26
(2) INFORMATION FOR SEQ ID NO: 34: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 21 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: Genomic DNA (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO (v) FRAGMENT TYPE: <Unknown> (vi) ORIGINAL SOURCE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: GGGTTTAAAC AGATCTCTGC A 21

What is claimed:
 1. A method comprising: introducing a mammalian artificial chromosome that comprises nucleic acid encoding a therapeutic product into a cell of a host animal, wherein the therapeutic product is expressed in the cell.
 2. The method of claim 1 wherein the artificial chromosome is a mammalian satellite artificial chromosome.
 3. The method of claim 1 wherein the artificial chromosome is a satellite artificial chromosome that is between about 250 Mb and about 400 Mb.
 4. The method of claim 1 wherein the artificial chromosome is a mammalian satellite artificial chromosome that is between about 150 Mb and about 200 Mb.
 5. The method of claim 1 wherein the artificial chromosome is a mammalian satellite artificial chromosome that is between about 90 Mb and about 120 Mb.
 6. The method of claim 1 wherein the artificial chromosome is a mammalian satellite artificial chromosome that is between about 60 Mb and about 100 Mb.
 7. The method of claim 1 wherein the artificial chromosome is a mammalian satellite artificial chromosome that is between about 20 Mb and about 200 Mb.
 8. The method of claim 1 wherein the artificial chromosome is a mammalian satellite artificial chromosome that is between about 100 Mb and about 200 Mb.
 9. The method of claim 1 wherein the artificial chromosome is a mammalian satellite artificial chromosome that is between about 7.5 Mb and about 60 Mb.
 10. The method of claim 1 wherein the artificial chromosome is a mammalian satellite artificial chromosome that is between about 30 Mb and about 50 Mb.
 11. The method of claim 1 wherein the artificial chromosome is a mammalian satellite artificial chromosome that is between about 10 Mb and about 15 Mb.
 12. The method of claim 1 wherein the artificial chromosome is a mammalian satellite artificial chromosome that is between about 15 Mb and about 50 Mb.
 13. The method of claim 1 wherein the artificial chromosome is a mammalian satellite artificial chromosome that is between about 1 Mb and about 15 Mb.
 14. The method of claim 1 wherein the artificial chromosome is a mammalian satellite artificial chromosome produced by a method comprising: introducing one or more nucleic acid fragments into a cell, wherein the nucleic acid fragment or fragments comprise a selectable marker; growing the cell under selective conditions to produce cells that have incorporated the nucleic acid fragment or fragments into their genomic DNA; and selecting a cell that comprises a mammalian satellite artificial chromosome.
 15. The method of claim 1 wherein the artificial chromosome is a mammalian satellite artificial chromosome that comprises a human centromere.
 16. The method of claim 1 wherein the artificial chromosome is a minichromosome.
 17. The method of claim 16, wherein the minichromosome is a λ neo-chromosome.
 18. The method of claim 16, wherein the minichromosome comprises a neocentromere.
 19. The method of claim 16, wherein the minichromosome is produced by a method comprising: introducing one or more nucleic acid fragments into a cell, wherein the nucleic acid fragment or fragments comprise a selectable marker; growing the cell under selective conditions to produce cells that have incorporated the nucleic acid fragment or fragments into their genomic DNA; and selecting a cell that comprises a minichromosome.
 20. A method comprising: introducing a mammalian artificial chromosome that comprises nucleic acid encoding a therapeutic product into tissue of a host animal, wherein the therapeutic product is expressed in cells of the tissue.
 21. The method of claim 20, wherein the artificial chromosome is a mammalian satellite artificial chromosome.
 22. The method of claim 20, wherein the artificial chromosome is a mammalian satellite artificial chromosome produced by a method comprising: introducing one or more nucleic acid fragments into a cell, wherein the nucleic acid fragment or fragments comprise a selectable marker; growing the cell under selective conditions to produce cells that have incorporated the nucleic acid fragment or fragments into their genomic DNA; selecting a cell that comprises a mammalian satellite artificial chromosome.
 23. The method of claim 20, wherein the artificial chromosome is a mammalian satellite artificial chromosome that comprises a human centromere.
 24. The method of claim 20, wherein the artificial chromosome is a minichromosome.
 25. The method of claim 20, wherein the minichromosome is a λ neo-chromosome.
 26. The method of claim 20, wherein the minichromosome comprises a neocentromere.
 27. The method of claim 20, wherein the minichromosome is produced by a method comprising: introducing one or more nucleic acid fragments into a cell, wherein the nucleic acid fragment or fragments comprise a selectable marker; growing the cell under selective conditions to produce cells that have incorporated the nucleic acid fragment or fragments into their genomic DNA; and selecting a cell that comprises a minichromosome.
 28. A method comprising: introducing a mammalian artificial chromosome that comprises nucleic acid encoding a gene product into a cell of a host animal, whereby the gene product is expressed in the cell.
 29. The method of claim 28, wherein the mammalian artificial chromosome is a mammalian satellite artificial chromosome.
 30. The method of claim 29, wherein the satellite artificial chromosome is isolated prior to introduction into the cell.